Methodology and tools exist, and are commonly used in practice, for analyzing
and estimating the coefficients in recursive path models, based on linear
relationships  with  continuous  endogenous  variables.  However,  the  methods  that  are 
traditionally  implemented  are  limited  by  requirements  for  linear  models,  as  well  as 
assumptions  of  independent  and  normally  distributed  error  terms. 

In  this  dissertation,  methodology  is  presented  that  extends  the  traditional  path 
model  method  of  the  "Calculus  of  Coefficients"  (COC).  An  analog  of  the  COC  is 
developed,  called  the  "Calculus  of  Effects"  (COE),  that  is  applicable  to  recursive 
path  models  involving  nonlinear  relationships  among  the  endogenous  variables. 
This  COE  methodology  results  in  a  partitioning  of  the  total  effect  into  sums  of  the 
direct  effect  and  all  indirect  effects  through  intermediate  variables  in  the  causal 
chain,  as  is  also  true  in  the  more  classical  COC. 


Techniques for estimation and testing of direct and indirect effects are developed.
Estimates of effects are based on maximum likelihood estimation and may,
under  certain  model  specifications,  require  Monte  Carlo  estimation  of  some  effects. 
Testing  of  effects  is  based  on  asymptotic  independence  of  parameter  estimates  for 
nonlinear  recursive  path  models,  which  is  developed  and  shown  herein. 

The  COE  is  also  presented  as  applied  to  models  strictly  containing  endogenous 
variables  that  are  dichotomous,  which  is  an  example  of  a  dependent  error  structure 
and,  thus,  an  extension  of  COC  to  models  without  requiring  an  assumption  of 
independent  error  terms.  Special  cases  of  models  containing  both  continuous 
and  dichotomous  variables  are  analyzed  and  discussed.  Applications  of  the  COE, 
parameter  estimation  and  testing  of  direct  and  indirect  effects  to  fields  of  maternal 
and  child  health  and  Alzheimer's  disease  are  included. 
CHAPTER 1
INTRODUCTION 

Historically,  path  models  and  their  associated  path  diagrams  have  been  very 
useful  methods  for  describing  interrelationships  among  causally  ordered  random 
variables  in  genetics  and  biology  [44,  74,  76],  social  sciences  [22,  31],  economics 
[44,  73],  and  many  other  subject  areas.  Such  interrelationships  can  be  described 
by  a  system  of  structural  equations  involving  the  random  variables  of  interest 
(called  endogenous  variables)  and  unknown  parameters.  The  endogenous  variables 
are  variables  whose  values  are  explained  by  other  variables  inside  the  system  of 
equations  (Kerlinger  and  Pedhazur  [39]).  These  equations  may  also  involve  other 
random  variables  (called  exogenous  variables,  those  with  variability  assumed  to  be 
determined  by  factors  outside  the  causal  model). 

In  describing  the  path  diagrams  and  structural  equations,  the  common 
notation  of  capital  letters  to  represent  random  variables  and  lower  case  letters  to 
represent  observed  values  will  be  utilized.  Thus,  path  diagrams  such  as  that  in 
Figure 1.1, where Y1, Y2 and Y3 are endogenous, X is exogenous, and the arrows
indicate  direction  of  causality,  are  used  to  visually  convey  assumed  or  potential 
causal  relationships.  Because  the  sequence  of  variables  is  assumed  to  be  causally 
ordered,  each  variable  can  have  both  a  "direct  effect"  (DE)  on  any  subsequent 
variable  in  the  causal  chain  and/or  an  "indirect  effect"  (IE)  through  its  influence 
on  intermediate  variables.  The  primary  goal  in  most  applications  of  path  analysis  is 
to  estimate  these  direct  and  indirect  effects. 

For example, in Figure 1.1, the variable Y1 may have a DE on variable Y3, represented
by a direct arrow linking the two variables. Also, Y1 may have an IE on Y3 through
Y2. This IE is represented by the direct arrow from Y1 to Y2 and then a direct
arrow from Y2 to Y3.

Figure 1.1: Path Diagram

When  studying  "classical"  path  models,  that  is,  a  sequence  of  linear  models 
where  all  endogenous  variables  involved  are  continuous  and  normally  distributed, 
known  methodology  exists  for  estimating  and  interpreting  the  direct  and  indirect 
effects  (Duncan  [11],  Land  [42],  Li  [44]  and  Wright  [74]).  This  methodology  is 
based  on  a  system  of  linear  equations  in  which  the  parameters  in  the  path  model 
are  easily  estimated  and  interpreted. 

The  classical  path  analysis  methods  for  estimating  and  interpreting  direct  and 
indirect  effects  break  down  when  any  of  the  relationships  of  interest  are  nonlinear. 
This  is  usually  the  case,  for  example,  when  one  or  more  of  the  endogenous  variables 
in the causal chain is discrete. Situations where classical methodology fails are
presented in the remainder of this section in the form of applied examples that
motivated this research.

The  first  motivating  example  comes  from  the  field  of  maternal  and  child 
health.  It  is  known  (Guyer  et  al.  [28])  that  black  infants  are  more  than  twice  as 
likely  as  whites  (13.7  per  thousand  liveborns  for  blacks  versus  6  for  whites)  to 
suffer  infant  mortality  (IM).  Also,  black  infants  are  more  than  twice  as  likely  (130 
per  thousand  liveborns  for  blacks  versus  65  for  whites)  to  experience  low  birth 
weight  (LBW,  birth  weight  less  than  2500  grams)  [28].  A  natural  causal  ordering 
is  formed  by  these  variables  and,  thus,  path  models  are  natural  for  describing  the 
interrelationships  between  race  of  the  infant  (IR  =  white/nonwhite),  LBW  (yes/no) 
and  IM  (yes/no),  while  controlling  for  one  exogenous  variable,  x,  say  mother's 
education  level  (or  collectively  used  to  denote  all  exogenous  variables).  The 
variables IR, LBW and IM correspond to variables Y1, Y2 and Y3, respectively, in
Figure 1.1. We wish to know how much of the total effect of IR on IM is attributed
to  the  indirect  effect  of  IR  through  LBW  and  how  much  is  due  to  an  effect  directly 
on  IM.  To  study  and  interpret  these  direct  and  indirect  effects,  "classical"  path 
analysis  methodology  fails  due  to  the  lack  of  linearity,  continuity  and  normality  of 
the  endogenous  variables  involved. 

A  second  example  where  this  type  of  problem  arises  is  in  the  pharmaceutical 
industry  or  clinical  trial  setting  when  information  on  a  near-term,  potential 
surrogate  variable  (or  "intermediate  endpoint"  [15,  p.  167])  is  available  prior  to 
the  necessary  long-term  outcome  variable  of  interest.  By  studying  the  effects  of  a 
risk  factor  or  treatment  on  a  valid  surrogate  variable,  the  duration  of  studies  and 
clinical  trials  could  be  shortened.  A  potential  surrogate  variable  is  a  biological 
marker  or  event  that  may  be  assessed  or  observed  prior  to  the  clinical  appearance 
of  a  disease  or  particular  outcome,  and  potentially  bears  strong  relationship  to  the 
development of that disease/outcome. For example, assume that occurrence or
non-occurrence of a stroke is causally related to blood pressure level and, also, to
hypertension  drugs.  Here,  the  ultimate  outcome  variable  is  "stroke"  (yes/no)  and 
is  considered  the  "long-term"  outcome.  An  intermediate  variable,  the  potential 
"near-term"  surrogate  variable,  is  blood  pressure  level.  The  first  variable  in  the 
chain  is  "treatment"  (yes/no).  Here  again,  the  outcome  variable  is  dichotomous. 
The  initial  variable  in  the  chain,  "treatment,"  may  also  be  dichotomous.  The  goal 
is  to  determine  whether  the  intermediate  variable,  blood  pressure,  is  an  adequate 
surrogate  for  the  ultimate  outcome,  stroke.  A  necessary  condition  for  blood 
pressure  to  be  a  perfect  surrogate  is  that  the  total  effect  of  treatment  on  incidence 
of  stroke  is  an  indirect  effect  through  blood  pressure.  Thus,  one  goal  is  to  estimate 
this  indirect  effect.  Once  more,  traditional  path  analysis  methods  of  estimation 
and  interpretation  do  not  apply  in  this  situation  because  of  failure  in  the  linearity, 
continuity  and  normality  assumptions. 

A  third  example  where  traditional  path  analysis  methods  fail  is  when  there  is 
an  inherent  nonlinear  relationship  in  the  causal  chain  of  variables.  A  path  model 
such  as  this  can  be  seen  in  the  following  chain  of  three  variables: 

Y\    =    IH  +  ei  (1.1) 

Y»    =    fo  +  ftexp(ft^)+A$exp(/37y2)  +  e3  (1.3) 

where,  for  a  specified  country  or  nation, 

Y1 = population size,

Y2 = median income, and

Y3 = average life expectancy of individuals in the population.
The  nonlinear  relationships  specified  for  variables  Y2  and  Y3  above  create  situations 
that cannot be analyzed by classical path methodology. Relationships such as these
among the endogenous variables will be examined in more detail in Chapter 5 of
this dissertation.


CHAPTER  2 
LITERATURE  REVIEW 

2.1     Classical  Linear  Models 
2.1.1     Models  with  Standardized  Variables 

Classical  path  analysis  was  first  introduced  by  Sewall  Wright  in  1921  [72] 
to the field of genetics. The concepts presented in his introductory paper were
made mathematically rigorous in a follow-up paper in 1934 [74]. Following Wright's
conceptualization  and  notation,  subsequent  authors  reintroduced  the  topic  to  the 
fields  of  sociology  (Duncan  [11],  Goodman  [26]),  and  econometrics  (Li  [44]). 

These  authors  dealt  only  with  variables  involved  in  structural  linear,  causal 
relationships.  Relationships  such  as  these  are  typically  viewed  in  a  one-way 
causal  flow  within  the  system  of  equations.  For  example,  in  terms  of  standardized 
variables, $z_i$, the last endogenous variable in the causal chain can be represented by
the following equation:

$$z_0 = p_{01} z_1 + p_{02} z_2 + \cdots + p_{0n} z_n \tag{2.1}$$

where, for example, $p_{0i}$ is the partial standardized regression coefficient between $z_i$
and $z_0$, controlling for the other z's in Equation 2.1 above. The $p_{0i}$ are population
"path coefficients" and measure the fraction of the standard deviation of the
dependent variable for which the associated standardized variable, $z_i$, is directly
responsible. Equation 2.1 is the last in a set of structural equations of the form

$$z_i = \sum_{j>i} p_{ij} z_j, \qquad i = 0, 1, \ldots, n-1. \tag{2.2}$$


Figure  2.1:  Path  Diagram  Illustration  of  Coefficients 

Wright,  and  the  others,  used  the  path  coefficients  and  zero  order  population 
correlation coefficients, denoted by $c_{ij}$, to analyze the assumed causal relationships.
Under models such as this, each $c_{ij}$ correlation decomposes into a single direct
path coefficient (e.g., $p_{ij}$) plus the sum of products of coefficients on the several
compound  paths  representing  all  the  indirect  connections  allowed  by  the  diagram 
(Duncan  [11]).  Figure  2.1  illustrates  the  following  four  variable  path  model  using 
the  notation  above: 


$$z_2 = p_{23} z_3 \tag{2.3}$$

$$z_1 = p_{12} z_2 + p_{13} z_3 \tag{2.4}$$

$$z_0 = p_{01} z_1 + p_{02} z_2 + p_{03} z_3. \tag{2.5}$$


Wright [74] showed that the concept of indirect effects as products of
path coefficients can be justified via substitution. For example, to examine
the effects of z2 on z0, substituting Equation 2.4 into Equation 2.5 yields

$$z_0 = (p_{01} p_{12} + p_{02}) z_2 + (p_{01} p_{13} + p_{03}) z_3. \tag{2.6}$$

From Equation 2.6, we see that the total effect of z2 on z0 (i.e., $p_{01} p_{12} + p_{02}$), when
controlling for z3, can be written as the sum of the direct effect of z2 on z0 (i.e.,
$p_{02}$) and the indirect effect, which is defined as the quantity $p_{01} p_{12}$, the product of
path  coefficients.  Such  results  follow  from  a  general  rule  called  the  "Calculus  of 
Coefficients"  (COC)  for  classical  linear  path  models. 
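
The decomposition in Equation 2.6 can be checked numerically. The following is a minimal sketch, not part of the original development: the coefficient values are hypothetical and chosen only for illustration, and the helper name z0_given is an assumption. Equation 2.3 is not needed because z3 is held fixed.

```python
# Hypothetical path coefficients for the model in Equations 2.4-2.5
# (illustrative values only; they do not come from the text).
p12, p13 = 0.4, 0.3
p01, p02, p03 = 0.6, 0.2, 0.1

def z0_given(z2, z3):
    """Evaluate z0 by recursive substitution, holding z2 and z3 fixed."""
    z1 = p12 * z2 + p13 * z3                  # Equation 2.4
    return p01 * z1 + p02 * z2 + p03 * z3     # Equation 2.5

# Calculus of Coefficients: total effect = direct effect + indirect effect.
direct_effect = p02
indirect_effect = p01 * p12
total_effect = direct_effect + indirect_effect

# Change in z0 per unit change in z2 with z3 held fixed at zero.
slope = z0_given(1.0, 0.0) - z0_given(0.0, 0.0)
print(total_effect, slope)
```

Because the system is linear, the slope obtained by substitution agrees with the sum of the direct path and the product of coefficients along the compound path through z1.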

Fienberg  [13,  p.  91]  discussed  this  method  for  linear  systems  further,  stating 
that  the  "calculus  of  path  coefficients"  allows  calculation  of  direct  and  indirect 
effects  associated  with  the  arrows  in  the  path  diagrams.  He,  however,  failed  to 
recognize  that  these  concepts  extend  beyond  classical  linear  path  models,  stating 
[13,  p.  91],  "For  the  analysis  of  categorical  variables:  (i)  there  is  no  calculus  of 
path  coefficients."  This  dissertation  shows  that  the  concept,  in  fact,  does  extend 
beyond  the  linear  model  to  more  general  path  models. 

2.1.2     Models  with  Unstandardized  Variables 

As  mentioned  by  Kerlinger  and  Pedhazur  [39],  the  method  of  classical  path 
analysis  reduces  to  the  solution  of  one  or  more  multiple  linear  regression  analyses. 
Thus,  the  idea  of  indirect  effects  as  the  product  of  path  coefficients  is  useful  not 
only  when  applied  to  cases  of  standardized  variables  and  coefficients,  but  extends 
to  models  which  use  the  original,  measured  variables  and  regression  coefficients, 
as  seen  in  work  by  Blalock  [7],  Heise  [30],  Kerlinger  and  Pedhazur  [39],  and 
Stolzenberg [63]. As suggested by Blalock [7] and Heise [30], path coefficients are
appropriate if one wishes to generalize to a specific population, but, to more fully
describe  "causal  laws"  [7,  p.  675]  and  relationships,  unstandardized  regression 
coefficients  are  more  appropriate.  Kerlinger  and  Pedhazur  [39,  p.  628]  agreed 
with  Duncan  [12]  when  the  former  stated  that  it  would  be  restorative  "if  research 
workers  relinquished  the  habit  of  expressing  variables  in  standard  form"  because 
standardization  tends  to  obscure  the  use  of  the  structural  coefficients  of  the  model. 
Henceforth,  this  research  focuses  on  the  "regression  models"  with  unstandardized 
variables  and  their  coefficients. 

In the case of standardized models, Wright [74] and Li [44] provided methodology
to calculate indirect effects by tracing the appropriate paths in path diagrams
and multiplying the associated coefficients along those paths. Their methodology
derives from recursive substitution and extends to regression models. These
calculations  of  direct  and  indirect  effects  are  relatively  easy  to  perform  for  basic 
path  models.  However,  in  complex  models,  the  task  of  systematically  including  all 
necessary  paths  when  constructing  the  decomposition  of  a  total  effect  into  direct 
and  indirect  parts  can  be  overwhelming  and  error-prone.  To  deal  with  these  more 
complicated  models,  methods  using  matrix  algebra  were  developed  by  Fox  [14]  and 
further  discussed  by  Kerlinger  and  Pedhazur  [39]. 

It  is  interesting  and  constructive  to  note  that  such  path  models  are  special 
cases of linear structural equation models (Jöreskog [37]). Let Y denote a p x 1
vector of interrelated response variables, each statistically dependent on at least
the corresponding element in a p x 1 vector of random errors denoted by E*, and
let X be a q x 1 vector of stochastic explanatory variables that are independent of
E*. Assume that E(E*) = 0 and, without loss of generality, that E(Y) = 0 and
E(X) = 0. Furthermore, assume that the structural relationships among Y, X, and
E* are described by the following system of equations:

$$B^* Y + \Gamma^* X = E^*, \tag{2.7}$$

where $B^*$ is a p x p matrix of coefficients on the variables in Y, and $\Gamma^*$ is a p x q
matrix of coefficients on the variables in X. The matrix equation 2.7 defines a
system of p equations, the ith of which describes an assumed linear structural
relationship of the ith variable in Y, $Y_i$, with variables in X and other variables in
Y.

Equation 2.7 can be written in more familiar form by moving all but the
$b_{ii} Y_i$ term in the ith equation to the right hand side of that equation and then
dividing both sides by $b_{ii}$, for each i = 1, 2, ..., p. This yields, in matrix form, the
simultaneous equations model

$$Y = BY + \Gamma X + E, \tag{2.8}$$

where $B = I - \mathrm{diag}(1/b_{ii}) B^*$, $\Gamma = -\mathrm{diag}(1/b_{ii}) \Gamma^*$, $E = \mathrm{diag}(1/b_{ii}) E^*$, $\mathrm{diag}(1/b_{ii})$
is the diagonal matrix with $1/b_{ii}$ in the ith diagonal position, i = 1, 2, ..., p, and $b_{ii}$
is the ith diagonal element of $B^*$ in Equation 2.7.

A  special  case  of  the  model  in  Equation  2.8,  where  the  elements  of  E  are 
independent  and  B  is  lower-triangular  with  zeros  on  the  diagonal  (i.e.,  all  elements 
on  and  above  the  main  diagonal  are  zero)  has  been  studied  extensively.  Such 
systems of equations are called recursive because, ignoring error terms, the values
of  the  endogenous  variables  are  determined  as  a  function  of  any  preceding  set  of 
variables  by  recursive  substitution  through  the  hierarchy  of  intermediate  equations. 
For  example,  if  B  in  Equation  2.8  is  lower  triangular  with  zeros  on  the  diagonal, 
then Y1 is determined by the exogenous variables, X, alone. Thus, Y2 is determined
by substituting the values of Y1 in terms of X and E1 into the second equation,
and Y3 by substitution of Y1 into Y2, and then Y1 and the resulting Y2 into the
third  equation,  etc.  Because  the  errors  in  recursive  equations  are  assumed  to  be 
independent,  endogenous  variables  are  independent  of  the  errors  in  equations 
where  they  appear  as  predictors  (although  dependent  on  the  entire  set  of  errors). 
Models in which either B is not triangular or the elements of E are not mutually
independent are non-recursive. The recursive linear models studied in this dissertation
are  known  as  classical  path  analysis  models. 

As  above,  in  the  case  of  standardized  linear  models,  recursive  substitution 
can  be  used  to  show  that  the  COC  holds.  In  the  notation  of  these  more  general 
structural  equations,  Fox's  method  for  calculating  indirect  effects  and  direct  effects 
is  presented  now.  To  summarize  these  methods,  the  following  notation  for  a  path 
model with p endogenous and q exogenous variables will be utilized. $\Gamma$ is the p x q
matrix of direct effects of exogenous variables on endogenous variables. B is the
p x p lower-triangular matrix of direct effects of endogenous variables on subsequent
endogenous variables in the causal ordering. Also, let $I_p$ denote the p x p identity
matrix. 

The total effects of the exogenous variables on the endogenous variables are
represented by the matrix $T_{yx} = (I - B)^{-1}\Gamma$ (for the associated derivations see Fox
[14]). The matrix $T_{yx}$ is called the reduced form coefficient matrix (Johnston [35])
and contains the total effects of X on Y. Using this methodology, the total effect
is defined in the literature to be the sum of direct and indirect effects. Thus, the
matrix of indirect effects is found by calculating $I_{yx} = T_{yx} - \Gamma$.

Likewise,  for  the  effects  of  endogenous  variables  on  subsequent  endogenous 
variables, we can calculate total effects by $T_{yy} = (I - B)^{-1} - I_p = (I - B)^{-1}B$
and indirect effects by $I_{yy} = T_{yy} - B$ [14]. One should note that when this method
is  applied  to  more  involved  models,  matrix  inversions  and  multiplications  may  be 
required  that  become  quite  complex,  and,  perhaps,  even  more  error-prone  than 
the  path  tracing  and  coefficient  multiplication  mentioned  earlier.  This  approach  is 
equally  applicable  to  both  path  coefficients  and  regression  coefficients. 
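
As an illustration of the matrix formulas above, the following sketch computes the total and indirect effect matrices for a small recursive model with three endogenous variables and one exogenous variable. It is an assumption-laden example rather than Fox's own implementation: the coefficient values and helper names are hypothetical, and the inverse of (I - B) is obtained from the finite series I + B + ... + B^(p-1), which is exact here because a strictly lower-triangular B satisfies B^p = 0.

```python
def identity(n):
    """Return the n x n identity matrix as a list of lists."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b, sign=1.0):
    """Elementwise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inverse_i_minus_b(b):
    """(I - B)^{-1} for strictly lower-triangular B via I + B + ... + B^(p-1)."""
    p = len(b)
    result = identity(p)
    power = identity(p)
    for _ in range(p - 1):
        power = matmul(power, b)
        result = add(result, power)
    return result

# Hypothetical direct effects (illustrative values only).
b21, b31, b32 = 0.5, 0.2, 0.4
B = [[0.0, 0.0, 0.0],
     [b21, 0.0, 0.0],
     [b31, b32, 0.0]]
Gamma = [[1.0], [0.3], [0.1]]   # direct effects of X on Y1, Y2, Y3

inv = inverse_i_minus_b(B)
T_yy = add(inv, identity(3), sign=-1.0)   # total effects among endogenous variables
I_yy = add(T_yy, B, sign=-1.0)            # indirect effects among endogenous variables
T_yx = matmul(inv, Gamma)                 # total effects of X on Y
I_yx = add(T_yx, Gamma, sign=-1.0)        # indirect effects of X on Y
print(T_yy[2][0], I_yy[2][0], T_yx[2][0], I_yx[2][0])
```

In this sketch the indirect effect of Y1 on Y3 equals the product b32 * b21, which is the same quantity that path tracing would give for the single compound path through Y2.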

Miller [48, p. 330], in a brief but thorough overview of path analysis, stated
that there is "an absolute lack of literature pertaining to the appropriate use"
of path analysis. To help remedy this situation, he gave six basic assumptions that
classical  path  methodology  should  follow.  Among  these  six  were 

1.  Change  in  one  variable  occurs  as  a  linear  function  of  changes  in  the  other 
variables  [48,  p.  332]. 

2.  The  usual  methodological  assumptions  for  regression  analysis  are  met,  which 
includes  the  assumption  that  variables  are  measured  on  an  "interval  level" 
[48,  p.  338]. 

Miller noted that the linearity assumption can be relaxed by performing mathematical
transformations on the nonlinear relations. This was his only suggestion
for  dealing  with  the  situation  of  nonlinearity  within  the  causal  system  of  equations. 
He  made  no  suggestion  for  dealing  with  variables  not  measured  on  an  interval 
scale,  such  as  dichotomous  variables,  or  for  situations  where  recursive  substitution 
yields  entry  of  error  term(s)  into  an  equation  in  a  nonlinear  manner  that  cannot  be 
transformed  into  a  linear  form. 

Many other authors, as well, have recognized that, along with the complications
created by categorical variables, nonlinearity also creates many difficulties in
the classical methodology. Karlin [38, p. 166] declared that in genetic epidemiology,
it is "especially clear that natural processes ... are inherently nonlinear."
Thus,  as  realized  by  Wright  [74,  p.  182],  particularly  in  genetic  applications,  the 
"principal  complications  are  the  possibilities  of  nonlinearity  in  the  combination  of 
effects"  of  different  variables.  Thus,  he  only  studied  strictly  linear  relationships  or 
those  with  "negligible"  nonlinearity.  Wright  [74,  p.  204]  also  stated  that  for  the 
simultaneous  structural  equations  in  path  analysis,  applications  generally  yield 
equations that are nonlinear with respect to the unknown path coefficients, "making
it impossible" to get a general formula for analysis via mere substitution. He
(Wright  [74,  p.  205])  suggested  "a  more  thorough  treatment"  of  the  topic. 
Others, also, have acknowledged this challenge. For example, Heise [30] noted
that  the  required  assumptions  for  classical  path  methodology  exist  infrequently 
in  sociology.  Goldfeld  and  Quandt  [23]  recognized  the  widespread  appearance 
of  nonlinearities  in  almost  all  economic  models  and  stated  that  the  problems  of 
nonlinearities  in  structural  models  have  not  been  "satisfactorily  treated."  Also, 
Bentler  [5,  p.  153]  stated,  "...  It  may  be  possible  to  develop  interesting  new 
nonlinear  structural  models." 

2.2     Nonlinear  Models  with  Continuous  Variables 
Literature  that  considers  the  case  of  continuous  variables  in  nonlinear  models 
is  quite  sparse.  One  author  that  addressed  this  particular  case  is  Stolzenberg  [63]. 
Stolzenberg  [63,  p.  459]  mentioned  that  "while  considerable  attention  has  been 
directed  to  linear  additive  models  .  .  .  nonlinear  models  have  been  neglected."  He 
proposed  methodology  that  would  "solve"  the  problem  of  nonlinearities  in  causal 
systems  of  equations.  He  defined  the  total  effect  of  an  "antecedent  variable"  on  a 
"consequent  variable"  as  the  partial  derivative  of  the  latter  variable  with  respect 
to  the  antecedent  variable,  after  full  substitution  of  all  equations  of  intervening 
variables  into  the  structural  equation  for  the  consequent  variable  [63,  p.  480].  As 
defined  by  Stolzenberg,  the  direct  effect  of  a  variable  on  a  consequent  variable  is 
defined  to  be  that  part  of  the  total  effect  not  transmitted  via  intervening  variables. 
Indirect  effects  are  calculated  using  the  proper  substitutions  along  with  the  chain 
rule  for  partial  derivatives.  Using  these  definitions  and  their  associated  derivatives, 
the  total,  direct  and  indirect  effects  can  be  calculated.  Stolzenberg  noted  that,  in 
these  nonlinear  structural  equations,  the  total  effect  can  be  decomposed  into  the 
sum  of  direct  and  indirect  effects.  This  is  a  very  important  fact  to  note  because 
Stolzenberg  recognized  methodology  that  applies  to  both  the  classical  path  analysis 
cases  as  well  as  nonlinear  cases,  a  point  that  was  not  previously  recognized  in  the 
literature.  However,  Stolzenberg  did  not  recognize  that,  as  defined  by  his  notation, 
"total effect" is a function of random errors and, thus, is not a good definition of an
"effect."  In  particular,  how  can  we  take  a  derivative  of  a  random  quantity,  i.e.  an 
error  term?  Later  in  this  dissertation,  it  is  proposed  to  take  expectations  in  order 
to  define  total  effects  that  are  not  random  quantities. 

Stolzenberg's  methodology  for  calculating  these  effects  is  accurate,  and  is 
simple  for  models  involving  variables  that  are  linear  in  the  path  coefficients. 
However,  when  the  path  coefficients  enter  into  the  equation  in  a  nonlinear  form 
the  resulting  total,  direct  and  indirect  effects  can  be  somewhat  complicated. 
He  considered  only  very  basic  examples  of  variables  entering  into  the  system  of 
equations  in  a  nonlinear  fashion. 

Using  one  of  these  basic  models,  consider  the  following  set  of  structural 
equations. (Note that the ordering of the subscripts for the Y variables is now Y1
as the first variable in the chain, Y2 and Y3 as intermediate variables and then Y4 as
the last variable in the chain. This is a reversal from the ordering used by Wright
and Li mentioned in Equations 2.3 - 2.5. This reverse ordering will be utilized
throughout the remainder of this dissertation.):

$$Y_2 = \beta_{20} + \beta_{21} Y_1 + e_2 \tag{2.9}$$

$$Y_3 = \beta_{30} + \beta_{31} Y_1 + \beta_{32} \ln Y_2 + e_3 \tag{2.10}$$

$$Y_4 = \beta_{40} + \beta_{41} Y_1 + \beta_{42} Y_2 + \beta_{43} Y_3 + \beta_{44} Y_1 Y_2 + e_4. \tag{2.11}$$

Substitution  of  Equation  2.10  into  Equation  2.11  yields 

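$$Y_4 = (\beta_{40} + \beta_{43}\beta_{30} + \beta_{43} e_3) + (\beta_{43}\beta_{31} + \beta_{41}) Y_1 + \beta_{42} Y_2 + \beta_{43}\beta_{32} \ln Y_2 + \beta_{44} Y_1 Y_2 + e_4. \tag{2.12}$$

Hence, taking the derivative of Y4 with respect to Y2 gives

$$\frac{\partial Y_4}{\partial Y_2} = \beta_{42} + \beta_{44} Y_1 + \frac{\beta_{43}\beta_{32}}{Y_2} \tag{2.13}$$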

which  is  defined  by  Stolzenberg  to  be  the  TE  of  Y2  on  Y4  .  (Again,  note  here  that, 
using  Stolzenberg's  definition  of  TE  requires  the  calculation  of  the  derivative  of  a 
random  variable  with  respect  to  a  random  variable.  Also,  what  is  the  derivative 
of a random quantity, more specifically, e4, with respect to the variable Y2?) The
quantity on the right-hand side of Equation 2.13 represents a sum of the quantities
$\beta_{42} + \beta_{44} Y_1$ (the direct effect of Y2 on Y4) and $\beta_{43}\beta_{32}/Y_2$ (the indirect effect of Y2 on
Y4 through Y3).
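
The decomposition in Equation 2.13 can be illustrated with a short numerical check. This is a sketch under stated assumptions: the coefficient values are hypothetical, and the error terms e3 and e4 are simply held fixed at zero, which sidesteps rather than resolves the objection raised above about differentiating random quantities.

```python
import math

# Hypothetical coefficients for Equations 2.10-2.11 (illustrative values only).
b30, b31, b32 = 1.0, 0.5, 2.0
b40, b41, b42, b43, b44 = 0.5, 0.3, 0.7, 0.4, 0.1

def y4_substituted(y1, y2, e3=0.0, e4=0.0):
    """Equation 2.12: Y4 after substituting Equation 2.10 for Y3."""
    y3 = b30 + b31 * y1 + b32 * math.log(y2) + e3
    return b40 + b41 * y1 + b42 * y2 + b43 * y3 + b44 * y1 * y2 + e4

def direct_effect(y1):
    """Part of Equation 2.13 not transmitted through Y3."""
    return b42 + b44 * y1

def indirect_effect(y2):
    """Part of Equation 2.13 transmitted through Y3."""
    return b43 * b32 / y2

y1, y2, h = 2.0, 4.0, 1e-6
# Central finite difference of Y4 with respect to Y2, errors held fixed.
numeric_total = (y4_substituted(y1, y2 + h) - y4_substituted(y1, y2 - h)) / (2 * h)
analytic_total = direct_effect(y1) + indirect_effect(y2)
print(numeric_total, analytic_total)
```

Unlike the linear case, both components depend on where they are evaluated: the direct effect varies with Y1 and the indirect effect with Y2.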

The  concept  presented  by  Stolzenberg  can  be  appealing  under  certain  model 
specifications,  as  will  be  discussed  in  Section  7.1.  However,  the  basic  causal  systems 
approached  by  his  methodology  are  not  always  realistic.  Much  more  complicated 
systems  need  to  be  addressed.  An  example  of  such  a  complicated  system  is 
considered  in  the  following  system  of  equations: 

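$$Y_2 = \beta_{21} Y_1 + e_2 \tag{2.14}$$

$$Y_3 = \exp(\beta_{32} Y_2 + \beta_{31} Y_1) + e_3. \tag{2.15}$$

Substitution of the right-hand side of Equation 2.14 for Y2 in Equation 2.15
yields the following:

$$Y_3 = \exp(\beta_{32}(\beta_{21} Y_1 + e_2) + \beta_{31} Y_1) + e_3. \tag{2.16}$$

Equivalently,

$$Y_3 = \exp(\beta_{32}\beta_{21} Y_1 + \beta_{31} Y_1 + \beta_{32} e_2) + e_3. \tag{2.17}$$

Therefore, using Stolzenberg's definition, the total effect of Y1 on Y3 is

$$\frac{\partial Y_3}{\partial Y_1} = (\beta_{32}\beta_{21} + \beta_{31}) \exp(\beta_{32}\beta_{21} Y_1 + \beta_{31} Y_1 + \beta_{32} e_2). \tag{2.18}$$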

Note  that  this  formula  for  the  total  effect  is  a  function  of  the  unknown 
random  quantity  e2.  This  result  presents  complications  in  both  the  analyses  and 
interpretations of the effects of Y1 on Y3.
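
To make the dependence on e2 concrete, the sketch below evaluates the right-hand side of Equation 2.18 at a fixed value of Y1 for three different values of the error term. The coefficient values and the chosen error values are hypothetical and serve only as an illustration.

```python
import math

# Hypothetical coefficients for Equations 2.14-2.15 (illustrative values only).
b21, b31, b32 = 0.8, 0.3, 0.5

def total_effect(y1, e2):
    """Equation 2.18 evaluated at a given y1 and a given value of e2."""
    slope = b32 * b21 + b31
    return slope * math.exp(b32 * b21 * y1 + b31 * y1 + b32 * e2)

# Same y1, three different realizations of the unobserved error e2.
effects = [total_effect(1.0, e2) for e2 in (-1.0, 0.0, 1.0)]
print(effects)
```

The three values differ by the factor exp(b32 * e2), so under this definition the "total effect" at a given Y1 is not a single number but changes with each realization of the error.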

Stolzenberg  did  not  address  these  complications  thoroughly.  In  particular,  he 
did  not  discuss  how  these  error  terms  in  exponential  functions  are  to  be  interpreted 
or  assessed.  He,  however,  did  discuss  using  an  "instantaneous  rate  of  change" 
of  a  variable  when  dealing  with  this  type  of  situation,  as  opposed  to  the  usual 
derivative  or  partial  derivative  [63,  p.  475].  A  derivative  is  viewed  as  the  rate  at 
which  the  dependent  variable  changes  per  change  in  the  independent  variable.  The 
"instantaneous  rate  of  change"  views  the  change  in  Y  on  a  proportional  basis  [63, 
p. 478]. This change is defined to be $(\partial Y/\partial X)/Y$ and measures the proportional change
in  Y  per  unit  change  in  X.  This  proposed  method  results  in  much  simpler  forms 
for  the  partial  derivatives  when  dealing  with  exponential  functions  that  can  be 
made  linear  by  transformation.  However,  his  suggestion  to  use  proportions  is  not 
applicable  to  systems  like  those  of  Equations  2.14-2.15,  or  systems  that  cannot  be 
transformed  into  linear  systems,  because  his  method  does  not  solve  the  problems 
encountered  when,  with  recursive  substitution  to  eliminate  intermediate  variables, 
the  errors  in  antecedent  equations  enter  nonlinearly  into  subsequent  equations. 

More  specifically,  using  the  methodology  proposed  by  Stolzenberg  [63]  as 
applied  to  Equations  2.14  and  2.15,  we  obtain 
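$$\frac{\partial Y_3 / Y_3}{\partial Y_1} = \frac{\partial Y_3 / \partial Y_1}{Y_3} \tag{2.19}$$

$$= \frac{(\beta_{32}\beta_{21} + \beta_{31}) \exp(\beta_{32}\beta_{21} Y_1 + \beta_{31} Y_1 + \beta_{32} e_2)}{\exp(\beta_{32}\beta_{21} Y_1 + \beta_{31} Y_1 + \beta_{32} e_2) + e_3} \tag{2.20}$$

$$\approx \beta_{32}\beta_{21} + \beta_{31}. \tag{2.21}$$

Thus, we have

$$\frac{\partial Y_3 / \partial Y_1}{Y_3} = \beta_{32}\beta_{21} + \beta_{31}$$

only if $e_3 \to 0$. So, we see that Stolzenberg's method does not eliminate the error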
term  e3  and,  thereby,  does  not  allow  for  the  cancelation  of  the  common  exponential 
terms  that  occur  in  both  the  numerator  and  denominator  in  Equation  2.20  above. 
However,  first  using  conditional  expectations,  as  proposed  by  this  dissertation, 
and  then  applying  Stolzenberg's  concept  of  ratios  and  proportionality  to  this  same 
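system, we see that

$$\frac{\partial E(Y_3 \mid y_1) / \partial y_1}{E(Y_3 \mid y_1)} = \frac{\partial f_3 / \partial y_1}{f_3},$$

where

$$f_3 = \exp(\beta_{32}\beta_{21} y_1 + \beta_{31} y_1).$$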
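
As an illustrative sketch that is not part of the original derivation, the following Python code contrasts the two proportional rates numerically. Equations 2.14 and 2.15 are not reproduced in this section, so the exponential form used here is an assumption inferred from the terms of Equation 2.20, and the coefficient values and function names are hypothetical.

```python
import math

# Hypothetical coefficient values, chosen only for illustration.
B21, B31, B32 = 0.5, 0.3, 0.8
TARGET = B32 * B21 + B31  # the quantity on the right of Equation 2.21


def pathwise_rate(y1, e2, e3):
    """Proportional rate (dY3/dY1)/Y3 for fixed error values, in the form of Equation 2.20."""
    core = math.exp(B32 * B21 * y1 + B31 * y1 + B32 * e2)
    return TARGET * core / (core + e3)


def conditional_mean_rate(y1, h=1e-6):
    """Proportional rate of f3(y1) = exp(b32*b21*y1 + b31*y1), by central difference."""
    def f3(y):
        return math.exp(B32 * B21 * y + B31 * y)
    return (f3(y1 + h) - f3(y1 - h)) / (2.0 * h) / f3(y1)
```

Under these assumptions, `pathwise_rate` equals `TARGET` only when `e3` is zero, whereas `conditional_mean_rate` recovers `TARGET` at any `y1`, because the exponential terms cancel once the error terms have been removed by taking expectations.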

Goldfeld and Quandt [23] also addressed the issue of nonlinearity in simultaneous
equations, considering only special cases of non-recursive models. They did not
address general recursive models and, hence, failed to acknowledge the problems
in interpreting and estimating direct and indirect effects that are encountered in
recursive systems of equations with nonlinearities, both of which are addressed by
this research.

Heise  [30]  recognized  that  nonlinear  equations  within  a  system  of  equations 
for  path  analysis  create  problems  that  do  not  exist  with  purely  linear  systems. 
However,  he  made  no  proposals  regarding  functions  that  are  not  "linearizable." 
He  only  discussed  nonlinear  functions  that  can  be  transformed  to  linear  functions. 
These  transformation  "solutions"  do  not  solve  the  problems  that  arise  in  path 
systems  with  recursive  substitution.   "Transformable"  linear  functions  suffer  from 
the  same  ills  as  non-transformable  functions  when  substituting  back  into  the  path 
system  of  equations.  Particularly,  consider  the  following  system  of  equations: 

$$Y_2 = \beta_{20} + \beta_{21} Y_1 + e_2 \tag{2.22}$$

$$Y_3 = \beta_{30} + \beta_{32} \ln Y_2 + e_3. \tag{2.23}$$

Note that $Y_3$ and $\ln Y_2$ are linearly related. Consequently, a transformation can
be implemented to make Equation 2.23 a purely linear function. Transformation
and substitution of $Y_2^* = \ln Y_2$ into Equation 2.23 yields an equation that is linear
in all variables. However, if we wish to study the direct effects of variable $Y_2$ on $Y_3$
and, perhaps, the indirect effects of $Y_2$ through $Y_3$ on variables following $Y_3$, a mere
transformation to eliminate the term $\ln Y_2$ does not solve the problems encountered
by the nonlinearity. More specifically, a transformation does not eliminate the
nonlinear situation that occurs via substitution of Equation 2.22 directly into
Equation 2.23, as seen by

$$Y_3 = \beta_{30} + \beta_{32} \ln(\beta_{20} + \beta_{21} Y_1 + e_2) + e_3. \tag{2.24}$$

Equation 2.24 above still contains an error term ($e_2$) in a nonlinear relationship
and is still a nonlinear function of $Y_1$. Hence, Heise failed to recognize that
transformations do not solve these problems that develop when substituting terms
from preceding variables into a nonlinear equation for a subsequent variable, as is
discussed within this dissertation.
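
To make the consequence of Equation 2.24 concrete, the following Python sketch compares the conditional mean of $Y_3$ given $y_1$ with the value obtained by simply dropping $e_2$ inside the logarithm. The parameter values and the uniform distribution assumed for $e_2$ are hypothetical choices made only so that the argument of the logarithm stays positive; they are not taken from the source.

```python
import math
import random

# Hypothetical parameter values for Equations 2.22-2.24.
B20, B21, B30, B32 = 5.0, 1.0, 0.0, 2.0
HALF_WIDTH = 3.0  # assumed: e2 ~ Uniform(-3, 3), so that B20 + B21*y1 + e2 > 0 near y1 = 0


def plug_in_mean(y1):
    """Value obtained by dropping e2 inside the logarithm of Equation 2.24."""
    return B30 + B32 * math.log(B20 + B21 * y1)


def exact_mean(y1):
    """Closed form of E(Y3 | y1) under the uniform assumption, with E(e3) = 0."""
    def antiderivative(t):
        return t * math.log(t) - t
    lo = B20 + B21 * y1 - HALF_WIDTH
    hi = B20 + B21 * y1 + HALF_WIDTH
    return B30 + B32 * (antiderivative(hi) - antiderivative(lo)) / (hi - lo)


def monte_carlo_mean(y1, n=100_000, seed=0):
    """Monte Carlo estimate of E(Y3 | y1); e3 has mean zero and is omitted."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        e2 = rng.uniform(-HALF_WIDTH, HALF_WIDTH)
        total += B30 + B32 * math.log(B20 + B21 * y1 + e2)
    return total / n
```

Because the logarithm is concave, the conditional mean falls below the plug-in value, so the error term $e_2$ cannot be ignored even though it has mean zero.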

2.3     Models  with  Non-Continuous  Variables 
Methods  for  analyzing  non-continuous  variables  in  path  diagrams  have  been 
approached  by  various  authors  including  Goodman  [24-26],  Holland  [31],  Wilson 
and  Bielby  [70],  and  Winship  and  Mare  [71].  Goodman  [24-26]  proposed  an  analog 
to path analysis suited to the case where the variables are polytomous or dichotomous.
His methods are used for model testing, partitioning the test statistic into
components  useful  for  testing  sub-models,  and  parameter  estimation.  Goodman's 
methods  are  based  on  log-linear  or  logit-linear  models.  In  his  methodology,  the  log 
of  conditional  odds  is  expressed  as  a  linear  function  of  the  general  mean  and  main 
effect variables. For example, for main effect variables A, B and C, he assumed
that the log of conditional odds of C given A and B, $\log \Omega^{ABC}$, can be written as

$$\log \Omega^{ABC} = \log \gamma^{C} + \log \gamma^{AC} + \log \gamma^{BC}$$

where $\log \gamma^{C}$ represents the general mean and the parameters $\log \gamma^{AC}$ and $\log \gamma^{BC}$
represent the main effects of variables A and B, respectively, on C.

Iterative-scaling  methods  are  used  to  calculate  estimated  frequencies  for 
contingency tables. These are used to calculate the estimates of the $\gamma$ parameters
and, therefore, $\log \gamma$, which is Goodman's version of a path coefficient. Goodman
stated that his methodology for analyzing relations among polytomous variables
is "somewhat analogous" to the methodology used in classical path analysis for
"quantitative variables" [26]. However, Goodman also stated that his methodology
does not allow for use of the basic theorems and ideas of path analysis as given by
Wright [74-75]. That is, Goodman's system of equations and methodology allow
for estimation of direct effects only and do not allow for the implementation of
the "calculus of path coefficients." Therefore, without an analog of the COC, no
estimation of indirect effects is available, which is a main goal of this proposal.

Holland  [31]  studied  path  models,  and  their  associated  systems  of  equations, 
that  involve  only  linear  functions  of  variables.  Using  his  notation,  systems  such  as 
the  following  were  considered: 

$$E(R \mid S = s) = as + d$$

$$E(Y \mid S = s, R = r) = bs + cr + d'$$

where Y is the outcome variable, R is the intermediate variable and S is the initial
variable in the causal chain. He followed by calculating conditional expectations to
obtain

$$\begin{aligned}
E(Y \mid S) &= E[E(Y \mid S, R) \mid S] \\
&= E(bS + cR + d' \mid S) \\
&= bs + cE(R \mid S) + d' \\
&= bs + c(as + d) + d' \\
&= (b + ac)s + dc + d'.
\end{aligned}$$

Holland declared the total effect of S on Y to be the quantity b + ac by inspection.
He [31, p. 456] noted that, "In general, E(Y|R = r, S = s) need not be linear in
r and s." His view of total effect as a conditional expectation is proper; however,
his statement regarding nonlinearity is oversimplified. He did not recognize the fact
that, if error terms enter into a subsequent equation in a nonlinear fashion, these
conditional expectations are no longer simple and easy to calculate. Addressing situations
such as these is a goal of this research.
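
Holland's linear result can be checked by simulation. The following Python sketch is an illustration under assumed conditions (standard normal S and independent standard normal errors, with hypothetical coefficient values); it estimates the slope of E(Y|S = s) by least squares and compares it with b + ac.

```python
import random


def simulate_total_effect(a, b, c, d, d_prime, n=20_000, seed=1):
    """Least-squares slope of Y on S in a simulated linear chain S -> R -> Y."""
    rng = random.Random(seed)
    s_vals, y_vals = [], []
    for _ in range(n):
        s = rng.gauss(0.0, 1.0)
        r = a * s + d + rng.gauss(0.0, 1.0)                # E(R | S = s) = a*s + d
        y = b * s + c * r + d_prime + rng.gauss(0.0, 1.0)  # E(Y | S, R) = b*s + c*r + d'
        s_vals.append(s)
        y_vals.append(y)
    s_bar = sum(s_vals) / n
    y_bar = sum(y_vals) / n
    sxy = sum((s - s_bar) * (y - y_bar) for s, y in zip(s_vals, y_vals))
    sxx = sum((s - s_bar) ** 2 for s in s_vals)
    return sxy / sxx


# Hypothetical values: the slope should be close to b + a*c = 0.4 + 0.8*0.5 = 0.8.
slope = simulate_total_effect(a=0.8, b=0.4, c=0.5, d=1.0, d_prime=2.0)
```

The agreement relies on the error in R entering the equation for Y linearly; it is this feature that is lost in the nonlinear systems considered in this research.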

Wilson  and  Bielby  [70]  presented  an  extension  of  the  logic  of  structural 
equation  methodology  to  be  applied  to  recursive  systems  allowing  categorical  and 
continuous  variables  as  either  exogenous  and/or  endogenous  factors.  For  example, 
when considering a two-equation system of purely categorical data with $Y^{(r)}$ as
the rth endogenous variable and $Z^{(s)}$ as the sth endogenous variable, where the Y
variables depend only on the exogenous variables X, but the Z variables depend
on both the Y variables and X, they let $\pi^{(r)} = P(Y^{(r)} = 1 \mid X)$ represent the
probability that the categorical variable $Y^{(r)}$ obtains a value of one, conditional on
fixed values of X. Hence,

$$E(Y^{(r)} \mid X) = \Psi^{(r)}(X\alpha^{(1)}, \ldots, X\alpha^{(R)})$$

where $\Psi^{(r)}$ might be used to represent the logistic or log function, or used to
represent a general linear model where $\pi^{(r)} = X\alpha^{(r)}$. Likewise, let

$$\tau^{(s)} = P(Z^{(s)} = 1 \mid X, Y) = X\beta^{(s)} + Y\gamma^{(s)}. \tag{2.25}$$


This leads to the conditional expectation of

$$\begin{aligned}
E(Z^{(s)} \mid X) &= \sum_{r=0}^{R} E(Z^{(s)} \mid X, Y^{(r)} = 1)\, P(Y^{(r)} = 1 \mid X) \\
&= E(Z^{(s)} \mid X, Y^{(0)} = 1)\, \pi^{(0)} + \sum_{r=1}^{R} E(Z^{(s)} \mid X, Y^{(r)} = 1)\, \pi^{(r)}.
\end{aligned}$$

Noting that $\pi^{(0)}$ must equal $1 - \sum_{r=1}^{R} \pi^{(r)}$, rearranging terms yields

$$E(Z^{(s)} \mid X) = E(Z^{(s)} \mid X, Y^{(0)} = 1) + \sum_{r=1}^{R} \left[ E(Z^{(s)} \mid X, Y^{(r)} = 1) - E(Z^{(s)} \mid X, Y^{(0)} = 1) \right] \pi^{(r)}.$$

Substituting terms representing a linear model, they showed

$$E(Z^{(s)} \mid X) = X\beta^{(s)} + \sum_{r=1}^{R} \left( X\beta^{(s)} + \gamma^{(sr)} - X\beta^{(s)} \right) X\alpha^{(r)} \tag{2.26}$$

$$= X \left( \beta^{(s)} + \sum_{r=1}^{R} \gamma^{(sr)} \alpha^{(r)} \right). \tag{2.27}$$

Thus, Wilson and Bielby showed that the coefficients in Equation 2.27 above
are the direct effects, $\beta^{(s)}$, plus the sum of the indirect effects, $\gamma^{(sr)}\alpha^{(r)}$, through
the $Y^{(1)}, \ldots, Y^{(R)}$. They noted that, in the linear case, these equations have
the same form and interpretation as for recursive linear models with continuous
variables. One should observe that Wilson and Bielby model categorical variables
linearly, as in Equation 2.25, which, in general, leads to a misspecified model.
Also, it should be noted that Wilson and Bielby's discussion was limited to,
using their terminology, "quasi-linear" models, that is, regression functions that
are dependent strictly upon a linear combination of the predetermined variables
and their associated parameters [70, p. 110]. Thus, functional relationships of a
nonlinear form in recursive systems were not addressed.
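
The equivalence of the mixture form and the reduced form in Equation 2.27 can be verified numerically. The following Python sketch assumes a linear probability specification with hypothetical values for X, $\alpha^{(r)}$, $\beta^{(s)}$ and $\gamma^{(sr)}$; the function names are introduced here for illustration only.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def mean_by_total_probability(x, alpha, beta_s, gamma_s):
    """E(Z|X) as a mixture over the categories of Y (category 0 is the reference)."""
    pis = [dot(x, a_r) for a_r in alpha]   # pi^(r) = X alpha^(r), r = 1..R
    pi0 = 1.0 - sum(pis)                   # pi^(0) = 1 - sum of the others
    base = dot(x, beta_s)                  # E(Z | X, Y^(0) = 1) = X beta^(s)
    return base * pi0 + sum((base + g) * p for g, p in zip(gamma_s, pis))


def mean_by_reduced_form(x, alpha, beta_s, gamma_s):
    """E(Z|X) = X (beta^(s) + sum_r gamma^(sr) alpha^(r)), the form of Equation 2.27."""
    coef = [b + sum(g * a_r[k] for g, a_r in zip(gamma_s, alpha))
            for k, b in enumerate(beta_s)]
    return dot(x, coef)


# Hypothetical inputs with R = 2 non-reference categories and two exogenous columns.
x = [1.0, 2.0]
alpha = [[0.1, 0.05], [0.2, 0.1]]
beta_s = [0.1, 0.05]
gamma_s = [0.3, -0.1]
mixture = mean_by_total_probability(x, alpha, beta_s, gamma_s)
reduced = mean_by_reduced_form(x, alpha, beta_s, gamma_s)
```

Both routes give the same value for these inputs, which reflects the algebra leading from the mixture expression to Equation 2.27.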

Winship  and  Mare  [71,  p.  55]  stated  that  "causal  models  constructed  from 
log-linear  and  logit  models  have  limitations  and  ...  are  not  directly  analogous 
to  causal  models  with  continuous  variables."  Noting  that  inconsistent  treatment 
of  intervening  variables  in  discrete  systems  does  not  allow  the  usual  "theorem  of 
path  analysis"  [11,  p.  56]  to  be  directly  applied,  Winship  and  Mare  proposed  a 
solution to the problem of incorporating discrete variables into causal models,
where the discrete variable can be either the outcome variable or an intermediate
variable.  They  showed  how  direct  and  indirect  effects  for  discrete  variables  can  be 
calculated  with  extensions  of  "continuous"  path  methods.  Their  methods,  which 
are  generalizations  of  structural  equations  and  classical  path  analysis  methods, 
break  the  TE  of  one  variable  on  another  down  into  the  sum  of  the  DE  plus  the  IEs. 
As  an  example,  a  two  equation  model  for  the  ith  observation  of  outcome  variable 
Z,  with  Y  as  the  intermediate  variable,  was  given  by  Winship  and  Mare  as 

$$Y = \beta_0 + \beta_1 X + \epsilon_Y$$

and

$$Z = \alpha_0 + \alpha_1 X + \alpha_2 Y + \epsilon_Z$$

where $\mathrm{cov}(\epsilon_Z, \epsilon_Y) = 0$. Winship and Mare denoted, as we do here, the TE
of X on Z by the quantity $\partial E(Z \mid X)/\partial X$. They explained that this TE can
be viewed as the expected effect for a group of observations with a common value
of X, i.e., the average for the population of observations with the same values on
X. In systems where Z is a dichotomous variable (either an endogenous variable
within the system or the final outcome variable), Winship and Mare modeled the
quantity $P(Z = 1) = F(\alpha_0 + \alpha_1 X + \alpha_2 Y)$ where F represents, for example, the
logistic or cumulative normal distribution, and obtained the TE of X on P(Z = 1)
to be the quantity $\alpha_1 f(\alpha_0 + \alpha_1 X + \alpha_2 Y) + \alpha_2 [f(\alpha_0 + \alpha_1 X + \alpha_2 Y)] \beta_1$, which
represents the sum of direct and indirect effects, respectively, of X on P(Z = 1).
Winship and Mare only considered linear systems of variables, which, generally,
are inappropriate specifications when considering discrete variables. They, like
various other authors, did not consider situations where these variables and their
error terms enter in a nonlinear fashion into subsequent equations in the system.
Also, Winship and Mare failed to mention an important consequence of their
methodology. More specifically, what are the implications and what do they mean
by the term $\partial E(Z \mid X)/\partial X$ when X is a discrete variable?
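
The structure of the Winship and Mare expression can be illustrated with a short Python sketch. It assumes a logistic F, takes f to be the derivative of F (the text does not define f), and replaces Y by its conditional mean $\beta_0 + \beta_1 X$, thereby ignoring $\epsilon_Y$. That last simplification is an assumption of this sketch; for nonlinear F it is exactly the kind of step that the present research calls into question. All coefficient values are hypothetical.

```python
import math


def logistic_cdf(u):
    return 1.0 / (1.0 + math.exp(-u))


def logistic_pdf(u):
    p = logistic_cdf(u)
    return p * (1.0 - p)


def te_formula(x, a0, a1, a2, b0, b1):
    """alpha1*f(eta) + alpha2*f(eta)*beta1, with Y replaced by b0 + b1*x."""
    eta = a0 + a1 * x + a2 * (b0 + b1 * x)
    direct = a1 * logistic_pdf(eta)
    indirect = a2 * logistic_pdf(eta) * b1
    return direct + indirect


def te_numeric(x, a0, a1, a2, b0, b1, h=1e-5):
    """Central-difference derivative of F(a0 + a1*x + a2*(b0 + b1*x)) in x."""
    def prob(v):
        return logistic_cdf(a0 + a1 * v + a2 * (b0 + b1 * v))
    return (prob(x + h) - prob(x - h)) / (2.0 * h)


# Hypothetical coefficient values.
args = dict(a0=-0.2, a1=0.7, a2=0.4, b0=0.1, b1=1.5)
formula_value = te_formula(0.5, **args)
numeric_value = te_numeric(0.5, **args)
```

The two values agree because the sum of direct and indirect terms is simply the chain rule applied to F. The numerical derivative is only meaningful for continuous X, which bears on the question raised above regarding discrete X.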

Cox  and  Wermuth  [10,  p.  441]  presented  models  that  included  mixtures  of 
binary  and  quantitative  variables  and  their  relation  to  "graphical  chain  models." 
However,  their  discussion  was  limited  to  include  only  "conditional  Gaussian 
regression  chain  models"  [10,  p.  441],  models  derived  from  underlying  multivariate 
normal  distributions  and  models  in  which  the  probabilities  for  the  dichotomous 
variables  are  specified  linearly.  Hence,  their  methodology  applies  only  to  situations 
where  the  variables  are  "linearizable."  This  methodology  suffers  from  the  same 
problems  as  discussed  previously  regarding  Heise  [30]  in  Section  2.2  and,  thus, 
is  not  a  focus  of  this  research.  Also,  their  methods  concentrate  on  testing  for 
independence  between  variables,  not  the  analysis  of  causal  chains  along  with  the 
associated  parameters  and  interpretations,  which  are  a  focus  of  this  dissertation. 

Freedman,  Graubard  and  Schatzkin  [15]  studied  paths  of  variables  in  the 
setting  of  "intermediate  endpoints"  [15,  p.  167]  for  chronic  diseases.  These  paths 
allow for consideration of a categorical treatment or risk factor, followed by a discrete
or continuous intermediate endpoint and then a binary outcome variable, such
as disease (yes/no). Freedman et al. used a criterion for validation of an intermediate
endpoint based on the fact that the treatment effect on disease, adjusted for the
intermediate endpoint, is equal to zero. In path analysis terminology, this implies
that  the  TE  of  treatment  on  disease  equals  the  IE  of  treatment  on  disease  through 
the  intermediate  endpoint. 

Using  the  notation  in  Freedman  et  al.,  if  the  intermediate  endpoint  S  captures 
the  dependence  of  outcome  variable  T  on  the  treatment  group  X,  then 

$$P(T = 1 \mid S, X) = P(T = 1 \mid S).$$

Hence, the associated model is

$$g(P(T = 1 \mid S, X = j)) = h(S) + \tau_j$$

where the term h(S) represents some function of S, perhaps $h(S) = \mu + \theta_i$ where
$S = s_i$, $i = 1, \ldots, k$, and S may take on k values. The parameter $\tau_j$ represents
the jth treatment effect. Two required restrictions are that $\sum_j \tau_j = 0$ and
$\sum_i \theta_i = 0$. Freedman et al. used the notation of $\hat{\tau}_1$ to represent the estimated
effect of treatment 1 on the outcome and $\hat{\tau}_{1a}$ to represent the estimated effect of
treatment 1 on the outcome, adjusted for the intermediate endpoint. These are
the analogs of "total effect" and "direct effect", respectively, as denoted in this
proposal. Also, the quantity $1 - \hat{\tau}_{1a}/\hat{\tau}_1$ estimates the proportion of the treatment
effect explained by the intermediate endpoint. This quantity is used to see if
the data provide convincing evidence that the intermediate endpoint explains a
significant proportion of the treatment effect. It should be noted that Freedman et
al. assumed that the TE of the treatment is the sum of direct and indirect effects
by definition without justification. Justification is a goal of this dissertation.


CHAPTER  3 
THE  CALCULUS  BEHIND  THE  "CALCULUS  OF  COEFFICIENTS" 

3.1     Basic  Notation 

One  goal  of  this  research  is  to  generalize  ideas  from  classical  path  analysis 
using  well-known  mathematical  results.  To  bridge  the  gap  from  classical  path 
methodology  to  the  nonlinear  and  discrete  cases,  and  thereby  provide  a  more 
unifying  view  of  path  analysis,  we  present  the  calculus  that  underlies  classical  path 
analysis  and  the  "Calculus  of  Coefficients"  (COC).  Surprisingly,  the  link  between 
this  calculus  and  the  COC  seems  not  to  be  recognized  in  the  literature. 

We  begin  by  reviewing  some  basic  notation  of  calculus.  We  utilize  the  notation 
as given by Milne-Thomson [49]. First, a difference quotient is denoted as the
quantity

$$\Delta_y f(y) = \frac{f(y + h) - f(y)}{h}. \tag{3.1}$$

This quantity is defined for all functions, f, and for all choices of h such that
$(y + h) \in D_y$, where $D_y$ represents the domain of f. Let $H_y$ denote the set of all h's
satisfying this condition for the given y. When there is no ambiguity, the subscript
of y in the $\Delta$ notation will be ignored for simplicity in notation.

For a continuous function, f, the derivative of f with respect to y is defined by

$$\frac{df(y)}{dy} = \lim_{h \to 0} \Delta f(y), \tag{3.2}$$

provided the limit exists at y.

It should be noted that the "$\Delta$" operator applies for either continuous
or discrete valued y variables. This notation will be particularly useful when
discussing functions of discrete variables. Note that $\Delta f(y)$ is analogous to the
derivative operator $\frac{df(y)}{dy}$ when $D_y$ is discrete.
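
The difference quotient of Equation 3.1 is straightforward to compute. The following Python sketch, with a hypothetical example function, evaluates it once on an integer-valued domain with h = 1 and once on a continuous domain with a small h, where it approximates the derivative of Equation 3.2.

```python
def difference_quotient(f, y, h):
    """Difference quotient of Equation 3.1: [f(y + h) - f(y)] / h."""
    return (f(y + h) - f(y)) / h


def square(y):
    return y * y


# Discrete domain (integers), unit step: (16 - 9) / 1.
discrete_change = difference_quotient(square, 3, 1)

# Continuous domain, small step: approaches the derivative 2*y = 6 at y = 3.
near_derivative = difference_quotient(square, 3.0, 1e-6)
```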

Now, consider $f(\mathbf{y})$ where $\mathbf{y} = (y_1, y_2, \ldots, y_p)'$ and the $y_k$, $k = 1, 2, \ldots, p$, are
independent variables. The partial difference quotient with respect to $y_k$ is defined
by

$$\Delta_{y_k} f(\mathbf{y}) = \frac{f(y_1, y_2, \ldots, y_k + h_k, y_{k+1}, \ldots, y_p) - f(\mathbf{y})}{h_k}$$

and is defined for all functions f, where $h_k$ is any value such that $(y_k + h_k) \in D_{y_k}$
and $\mathbf{y} \in D = D_{y_1} \times D_{y_2} \times \cdots \times D_{y_p}$, that is, D represents the p-dimensional domain
of $f(\mathbf{y})$.

A partial derivative of f with respect to $y_k$ is denoted $\frac{\partial f(\mathbf{y})}{\partial y_k}$ and is defined by

$$\frac{\partial f(\mathbf{y})}{\partial y_k} = \lim_{h_k \to 0} \Delta_{y_k} f(\mathbf{y}),$$

provided this limit exists, where f is a function of $\mathbf{y}$, $y_1, y_2, \ldots, y_{k-1}, y_{k+1}, \ldots, y_p$
are held constant and differentiation occurs with respect to $y_k$. As defined above,
derivatives and partial derivatives apply only for functions of continuous variables,
while difference quotients and partial difference quotients are defined generally.

3.2     The Multi-Variable Chain Rule
Recall that the Chain Rule (CR) for functions of a single variable gives the
rule for differentiating a composite function. We follow the notation given in
Anton [3] and Stewart [62]. If $u = f(y_1)$ and $y_1 = g(t)$ where f and g are both
differentiable functions, then u is indirectly a differentiable function of t and, by
the CR for compound functions,

$$\frac{du}{dt} = \frac{du}{dy_1} \frac{dy_1}{dt}.$$

Likewise, if $u = f(y_1, y_2)$ is a differentiable function of $y_1$ and $y_2$ where $y_1 = g(t)$
and $y_2 = h(t)$ are both differentiable functions of t, then u is a differentiable
function of t and, by the CR for multi-variable compound functions (MVCR), we
have

$$\frac{du}{dt} = \frac{\partial u}{\partial y_1} \frac{dy_1}{dt} + \frac{\partial u}{\partial y_2} \frac{dy_2}{dt}.$$

In general, a similar rule applies in the case where u is a differentiable function
of the p variables $y_1, y_2, \ldots, y_p$, where each $y_k$, $k = 1, 2, \ldots, p$, is itself a function
of t. That is,

$$\frac{du}{dt} = \frac{\partial u}{\partial y_1} \frac{dy_1}{dt} + \frac{\partial u}{\partial y_2} \frac{dy_2}{dt} + \cdots + \frac{\partial u}{\partial y_p} \frac{dy_p}{dt}. \tag{3.4}$$

Note that within Equation 3.4 above, the CR may be applied again to evaluate
each $\frac{dy_k}{dt}$ if $y_k$ itself is a compound function of, say, $w_1, w_2, \ldots, w_r$, where each $w_j$,
$j = 1, 2, \ldots, r$, is a function of t.
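
As a quick numerical check of the MVCR with p = 2, the following Python sketch compares the chain-rule expression with a central-difference derivative of the composite function. The particular functions u, g and h are hypothetical examples chosen for this illustration.

```python
import math


def u(y1, y2):
    return y1 * y2 + math.sin(y2)


def g(t):
    return t ** 2       # y1 = g(t)


def h(t):
    return math.exp(t)  # y2 = h(t)


def chain_rule_derivative(t):
    """du/dt = (du/dy1)(dy1/dt) + (du/dy2)(dy2/dt), using analytic partials."""
    y1, y2 = g(t), h(t)
    du_dy1 = y2
    du_dy2 = y1 + math.cos(y2)
    dy1_dt = 2.0 * t
    dy2_dt = math.exp(t)
    return du_dy1 * dy1_dt + du_dy2 * dy2_dt


def numeric_derivative(t, step=1e-6):
    """Central-difference derivative of the composite function u(g(t), h(t))."""
    return (u(g(t + step), h(t + step)) - u(g(t - step), h(t - step))) / (2.0 * step)
```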

3.3     Leibniz's  Rule 

Using  notation  and  terminology  presented  in  Chapter  18  of  Taylor  and  Mann 
[65],  we  now  review  a  theorem  that  will  be  important  for  future  developments  in 
this  research.  This  law  is  presented  as  Theorem  XIV  in  Taylor  and  Mann  [65]  and 
is  commonly  known  as  "Leibniz's  Rule." 

Consider all $x_i$, y and $z_j$, $i = 1, 2, \ldots, p_1$, $j = 1, 2, \ldots, p_2$, defined on
the $(p_1 + p_2 + 1)$-dimensional Euclidean space, denoted by $R^{p_1 + p_2 + 1}$, such that
$a_i \le x_i \le b_i$, $c \le y \le d$ and $t_j \le z_j \le w_j$. Suppose that $f(\mathbf{x}, y, \mathbf{z})$ is an
integrable function of $\mathbf{x} = (x_1, x_2, \ldots, x_{p_1})'$ for all values of y and further suppose
that the partial derivative $\frac{\partial f(\mathbf{x}, y, \mathbf{z})}{\partial y}$ exists, is a continuous function of $\mathbf{x}$, y and
$\mathbf{z} = (z_1, z_2, \ldots, z_{p_2})'$, and is in $R^{p_1 + p_2 + 1}$. Then $F(y, \mathbf{z}) = \int_{\mathbf{x}} f(\mathbf{x}, y, \mathbf{z})\, d\mathbf{x}$ has
partial derivative with respect to y given by

$$\frac{\partial F(y, \mathbf{z})}{\partial y} = \int_{\mathbf{x}} \frac{\partial f(\mathbf{x}, y, \mathbf{z})}{\partial y}\, d\mathbf{x}.$$
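
Leibniz's Rule can be checked numerically in a simple special case. The following Python sketch assumes a single integration variable on [0, 1], no z variables, and the hypothetical integrand f(x, y) = exp(xy); the midpoint rule is used as a stand-in for the integral and is a choice of this sketch.

```python
import math


def midpoint_integral(func, a, b, n=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width) for i in range(n))


def f(x, y):
    return math.exp(x * y)


def df_dy(x, y):
    return x * math.exp(x * y)


def F(y):
    """F(y) = integral over x in [0, 1] of f(x, y)."""
    return midpoint_integral(lambda x: f(x, y), 0.0, 1.0)


def derivative_under_integral(y):
    """Right side of Leibniz's Rule: integral of df/dy over x."""
    return midpoint_integral(lambda x: df_dy(x, y), 0.0, 1.0)


def derivative_of_integral(y, step=1e-4):
    """Left side of Leibniz's Rule: central-difference derivative of F."""
    return (F(y + step) - F(y - step)) / (2.0 * step)
```

At y = 1 the right side is the integral of x exp(x) over [0, 1], which equals 1, and both functions return values close to 1.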


3.4    The  Calculus  of  Finite  Differences 
Jordan [36, p. 1] explained that there are two sorts of functions to be distinguished.
One sort is when the variable x is continuous, the other when x is discrete.
When writing about the former such functions, he stated, "These functions
belong to the domain of Infinitesimal Calculus" and then added the following:

    Secondly, functions in which the variable x takes only the given
    values $x_0, x_1, \ldots, x_n$; then the variable is discontinuous. To such
    functions the methods of Infinitesimal Calculus are not applicable. The
    Calculus of Finite Differences deals especially with such functions. . .

Recall that the product rule as applied to continuous valued y variables and
differentiable f and g (Anton [3]) is

$$\frac{d}{dy}[f(y)g(y)] = f(y) \frac{d}{dy}[g(y)] + g(y) \frac{d}{dy}[f(y)]. \tag{3.5}$$

The analog of the product rule for functions on a discrete domain (Jordan [36]) is

$$\Delta[f(y)g(y)] = g(y) \Delta f(y) + f(y + h_1) \Delta g(y), \tag{3.6}$$

where the $\Delta$ operator is defined as in Equation 3.1 and $h_1$ is chosen such that the
quantity $(y + h_1)$ is in the domain of f. Note that possible values of $h_1$ depend on y.
It should also be noted that reversing the order of g(y) and f(y) gives

$$\Delta[g(y)f(y)] = f(y) \Delta g(y) + g(y + h_2) \Delta f(y) \tag{3.7}$$

where $h_2$ is again dependent upon values of y and is chosen such that $(y + h_2)$ is in
the domain of g.

This rule can be repeatedly applied to calculate the difference quotient of a
product of more than two functions. For example,

$$\Delta[f(y)g(y)m(y)] = [g(y)m(y)] \Delta f(y) + f(y + h_1) \Delta[g(y)m(y)] \tag{3.8}$$

$$= [g(y)m(y)] \Delta f(y) + f(y + h_1)[m(y) \Delta g(y) + g(y + h_2) \Delta m(y)] \tag{3.9}$$

where, again, $(y + h_1)$ is in the domain of f and $(y + h_2)$ is in the domain of g.
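
The discrete product rules of Equations 3.6 and 3.9 can be confirmed on an integer domain. In the following Python sketch the functions f, g and m are hypothetical examples, and a single step size h plays the role of both $h_1$ and $h_2$, which is an assumption of this illustration.

```python
def delta(func, y, h=1):
    """Difference quotient of Equation 3.1."""
    return (func(y + h) - func(y)) / h


def f(y):
    return y * y + 1


def g(y):
    return 3 * y - 2


def m(y):
    return 2 ** y


def two_factor_gap(y, h=1):
    """Left side minus right side of Equation 3.6."""
    lhs = delta(lambda v: f(v) * g(v), y, h)
    rhs = g(y) * delta(f, y, h) + f(y + h) * delta(g, y, h)
    return lhs - rhs


def three_factor_gap(y, h=1):
    """Left side of Equation 3.8 minus right side of Equation 3.9."""
    lhs = delta(lambda v: f(v) * g(v) * m(v), y, h)
    rhs = (g(y) * m(y) * delta(f, y, h)
           + f(y + h) * (m(y) * delta(g, y, h) + g(y + h) * delta(m, y, h)))
    return lhs - rhs
```

Both gaps are zero for every integer y and step tried, in contrast to the continuous product rule, where the shifted argument $y + h_1$ does not appear.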

Other properties (Milne-Thomson [49]) of the $\Delta$ operator that will be useful
are

1. $\Delta$ is a linear operator,

$$\Delta[a f(y) + b g(y)] = a \Delta f(y) + b \Delta g(y)$$

where a and b are constants; and

2. the Index law is

$$\Delta^m[\Delta^n f(y)] = \Delta^{m+n} f(y)$$

where m and n are positive integers representing m and n repetitions of the
$\Delta$ operator, respectively.

Properties similar to those given in Equations 3.6-3.9 hold for partial difference
quotients.


CHAPTER  4 

AN  ALTERNATIVE  AND  GENERALIZABLE  DERIVATION  OF  THE 

CALCULUS  OF  COEFFICIENTS 

4.1     Classical  Linear  Path  Models 
4.1.1     The  Model 

In  Equation  2.8,  we  represented  a  system  of  simultaneous  linear  structural 
equations  using  matrix  notation  as 

$$Y = BY + \Gamma X + E,$$

where Y is a vector of p endogenous variables, $Y_1, Y_2, \ldots, Y_p$, X is a vector of
exogenous variables, $X_1, X_2, \ldots, X_q$, and E is a p x 1 vector of random errors.

The  usual  assumptions  of  structural  equations  in  classical  path  analysis  models 
(Johnson  and  Wichern  [34],  Joreskog  [37])  are 

1.  E  follows  a  Multivariate  Normal  distribution. 

2. $E(e_k) = 0$ $\forall\ k = 1, 2, \ldots, p$.

3.  The  errors  in  E  are  mutually  independent. 

4.  The  elements  of  E  are  mutually  independent  of  the  elements  of  X. 

5.  B  is  lower  triangular  with  zeros  on  the  diagonal.  A  second  way  of  stating  this 
assumption  that  generalizes  to  non-continuous  functions  is  to  state  that  each 
endogenous  variable  is  a  function  only  of  previous  endogenous  and  exogenous 
variables,  that  is,  an  endogenous  variable  is  not  a  function  of  subsequent 
variables. 

6. The matrix $I - B$ is nonsingular.

Note that assumptions 3 and 4 above jointly imply that $e_k$, the kth element of E,
is independent of $Y_j$ $\forall\ j < k$. Hence, the assumption of mutual independence
of the error terms, along with the recursive nature of the equations, implies that
endogenous variables are independent of the error terms in equations where
they appear as predictors, although dependent on the entire set of errors. These
assumptions are a special case of the structural equations discussed by Joreskog
[37] where the matrix B is lower triangular. The independence of errors and
endogenous variables included in the same equation allows the use of Ordinary
Least Squares estimation methods for consistent estimation of DEs and, hence,
also, of IEs, which are products of DEs.

We  rewrite  the  matrix  notation  in  Equation  2.8  as  a  set  of  recursive  model 
equations  as  follows: 

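\begin{align}
Y_1 &= f_1(X; \boldsymbol{\gamma}_1) + e_1 \tag{4.1} \\
Y_2 &= f_2(Y_1, X; \boldsymbol{\beta}_2, \boldsymbol{\gamma}_2) + e_2 \tag{4.2} \\
Y_3 &= f_3(Y_1, Y_2, X; \boldsymbol{\beta}_3, \boldsymbol{\gamma}_3) + e_3 \tag{4.3} \\
&\;\;\vdots \notag \\
Y_{p-1} &= f_{p-1}(Y_1, Y_2, \ldots, Y_{p-2}, X; \boldsymbol{\beta}_{p-1}, \boldsymbol{\gamma}_{p-1}) + e_{p-1} \tag{4.4} \\
Y_p &= f_p(Y_1, Y_2, \ldots, Y_{p-1}, X; \boldsymbol{\beta}_p, \boldsymbol{\gamma}_p) + e_p, \tag{4.5}
\end{align}

where each $\boldsymbol{\beta}_k$, $\boldsymbol{\gamma}_k$ and $e_k$ represent the kth row of B, $\Gamma$ and E, respectively, in the
matrix equation above, and where

\begin{align}
f_2(Y_1, X; \boldsymbol{\beta}_2, \boldsymbol{\gamma}_2) &= \beta_{20} + \beta_{21} Y_1 + \boldsymbol{\gamma}_2' X \tag{4.6} \\
f_3(Y_1, Y_2, X; \boldsymbol{\beta}_3, \boldsymbol{\gamma}_3) &= \beta_{30} + \beta_{31} Y_1 + \beta_{32} Y_2 + \boldsymbol{\gamma}_3' X \tag{4.7} \\
f_{p-1}(Y_1, Y_2, \ldots, Y_{p-2}, X; \boldsymbol{\beta}_{p-1}, \boldsymbol{\gamma}_{p-1}) &= \beta_{p-1,0} + \beta_{p-1,1} Y_1 + \beta_{p-1,2} Y_2 + \cdots + \beta_{p-1,p-2} Y_{p-2} + \boldsymbol{\gamma}_{p-1}' X \tag{4.8} \\
f_p(Y_1, Y_2, \ldots, Y_{p-1}, X; \boldsymbol{\beta}_p, \boldsymbol{\gamma}_p) &= \beta_{p0} + \beta_{p1} Y_1 + \beta_{p2} Y_2 + \beta_{p3} Y_3 + \cdots + \beta_{p,p-1} Y_{p-1} + \boldsymbol{\gamma}_p' X. \tag{4.9}
\end{align}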

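Note that, by substituting $Y_1, Y_2, \ldots, Y_{p-1}$ with their associated functional
notation, we can re-write $f_p$ and, consequently, $Y_p$ as a multivariate compound
function of $Y_1, Y_2, \ldots, Y_{p-1}$, $\boldsymbol{\beta}_2, \boldsymbol{\beta}_3, \ldots, \boldsymbol{\beta}_p$, $e_2, e_3, \ldots, e_p$, as well as X and
$\boldsymbol{\gamma}_2, \ldots, \boldsymbol{\gamma}_p$. That is, we can write

$$Y_p = f_p[Y_1, f_2(Y_1; \boldsymbol{\beta}_2) + e_2, \ldots, f_{p-1}(Y_1, Y_2, \ldots, Y_{p-2}; \boldsymbol{\beta}_{p-1}) + e_{p-1}; \boldsymbol{\beta}_p] + e_p, \tag{4.10}$$

where we have suppressed the notation indicating that $f_k$ is also a function of X
and $\boldsymbol{\gamma}_k$, for each $k = 1, 2, \ldots, p$.

Likewise, with recursive substitution, $Y_p$ can be written solely as a function of
$Y_1$, $\boldsymbol{\beta}_2, \boldsymbol{\beta}_3, \ldots, \boldsymbol{\beta}_p$ and $e_2, e_3, \ldots, e_p$. We have

$$Y_p = f_p\{Y_1, f_2(Y_1; \boldsymbol{\beta}_2) + e_2, f_3[Y_1, f_2(Y_1; \boldsymbol{\beta}_2) + e_2] + e_3, \ldots, f_{p-1}[Y_1, f_2(Y_1; \boldsymbol{\beta}_2) + e_2, \ldots] + e_{p-1}\} + e_p. \tag{4.11}$$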

4.1.2     Recursive  Substitution  and  the  "Calculus  of  Coefficients" 

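From Equation 4.11 and Equations 4.7-4.9, we can write the following:

\begin{align*}
Y_p ={}& \beta_{p0} + \beta_{p1} Y_1 + \beta_{p2}(\beta_{20} + \beta_{21} Y_1 + \boldsymbol{\gamma}_2' X + e_2) \\
& + \beta_{p3}[\beta_{30} + \beta_{31} Y_1 + \beta_{32}(\beta_{20} + \beta_{21} Y_1 + \boldsymbol{\gamma}_2' X + e_2) + \boldsymbol{\gamma}_3' X + e_3] + \cdots \\
& + \beta_{p,p-1}\{\beta_{p-1,0} + \beta_{p-1,1} Y_1 + \beta_{p-1,2}(\beta_{20} + \beta_{21} Y_1 + \boldsymbol{\gamma}_2' X + e_2) \\
& \quad + \beta_{p-1,3}[\beta_{30} + \beta_{31} Y_1 + \beta_{32}(\beta_{20} + \beta_{21} Y_1 + \boldsymbol{\gamma}_2' X + e_2) + \boldsymbol{\gamma}_3' X + e_3] + \cdots \\
& \quad + \beta_{p-1,p-2}[\beta_{p-2,0} + \beta_{p-2,1} Y_1 + \cdots + \boldsymbol{\gamma}_{p-2}' X + e_{p-2}] + \boldsymbol{\gamma}_{p-1}' X + e_{p-1}\} + \boldsymbol{\gamma}_p' X + e_p.
\end{align*}

Similarly, $Y_{p-1}, Y_{p-2}, \ldots, Y_2$ can be written as functions of the previous Y's,
X's and e's. This form of the model for $Y_p$ can be used to derive the COC for
partitioning the total effect of $Y_1$ on $Y_p$ into the direct and indirect effects.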

For  the  purpose  of  conveying  concepts  in  a  simple  yet  generalizable  context, 
consider  a  linear  model  with  four  endogenous  variables,  that  is,  p  =  4,  and  one 
exogenous  variable.  Then,  the  set  of  structural  equations,  obtained  from  the 
general  linear  system  defined  by  Equations  4.2-4.9,  is: 

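\begin{align}
Y_1 &= \gamma_1 X + e_1 \tag{4.12} \\
Y_2 &= \beta_{20} + \beta_{21} Y_1 + \gamma_2 X + e_2 \tag{4.13} \\
Y_3 &= \beta_{30} + \beta_{31} Y_1 + \beta_{32} Y_2 + \gamma_3 X + e_3 \tag{4.14} \\
Y_4 &= \beta_{40} + \beta_{41} Y_1 + \beta_{42} Y_2 + \beta_{43} Y_3 + \gamma_4 X + e_4. \tag{4.15}
\end{align}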

Substitution  of  the  right-hand  side  of  Equation  4.2  into  Y2  in  Equation  4.3 
yields 

$$Y_3 = \beta_{30} + \beta_{31} Y_1 + \beta_{32}(\beta_{20} + \beta_{21} Y_1 + \gamma_2 X + e_2) + \gamma_3 X + e_3. \tag{4.16}$$

Now, taking a conditional expectation of $Y_3$ given $Y_1 = y_1$ and $X = x$ gives

$$E(Y_3 \mid Y_1 = y_1, X = x) = \beta_{30} + \beta_{32}\beta_{20} + (\beta_{31} + \beta_{32}\beta_{21}) y_1 + (\beta_{32}\gamma_2 + \gamma_3) x. \tag{4.17}$$

Thus, intuitively, the DE of $Y_1$ on $Y_3$ is represented by $\beta_{31}$. The IE of $Y_1$ on $Y_3$
through variable $Y_2$ is represented by $\beta_{32}\beta_{21}$, which is actually the product of the
DE of $Y_1$ onto $Y_2$ ($\beta_{21}$) and the DE of $Y_2$ onto $Y_3$ ($\beta_{32}$). The TE of $Y_1$ on $Y_3$, which
is the effect of $Y_1$ on $Y_3$ ignoring the intermediate variable $Y_2$, is then represented
by the quantity $\beta_{31} + \beta_{32}\beta_{21}$, or, equivalently, by the sum of the direct and indirect
effect as shown above. This TE quantity is easily and generally derived by the
above substitution method.

Here,  we  have  illustrated  the  effects  of  an  intermediate  variable  on  another 
intermediate variable occurring later in the causal chain. We do this by conditioning
on  the  variables  previous  to  the  first  intermediate  variable  and  ignoring  the 
variables  posterior  to  the  two  intermediate  variables  of  interest  in  this  recursive 
model.  This  method  holds  true  for  any  two  variables  in  the  causal  chain. 
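
The conditional mean in Equation 4.17 can be checked by simulation. The following Python sketch generates data from Equations 4.12-4.14 with hypothetical coefficient values and standard normal X and errors (assumptions of this illustration), and then regresses $Y_3$ on $Y_1$ and X while ignoring $Y_2$.

```python
import random

# Hypothetical coefficient values for Equations 4.12-4.14.
G1 = 1.0
B20, B21, G2 = 0.5, 0.6, 0.3
B30, B31, B32, G3 = 1.0, 0.2, 0.7, -0.4


def simulate(n=40_000, seed=2):
    """Draw (y1, x, y3) triples from the recursive linear system."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y1 = G1 * x + rng.gauss(0.0, 1.0)
        y2 = B20 + B21 * y1 + G2 * x + rng.gauss(0.0, 1.0)
        y3 = B30 + B31 * y1 + B32 * y2 + G3 * x + rng.gauss(0.0, 1.0)
        rows.append((y1, x, y3))
    return rows


def ols_two_predictors(rows):
    """Least-squares slopes of the third column on the first two (with intercept)."""
    n = len(rows)
    m1 = sum(r[0] for r in rows) / n
    m2 = sum(r[1] for r in rows) / n
    my = sum(r[2] for r in rows) / n
    s11 = sum((r[0] - m1) ** 2 for r in rows)
    s22 = sum((r[1] - m2) ** 2 for r in rows)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in rows)
    s1y = sum((r[0] - m1) * (r[2] - my) for r in rows)
    s2y = sum((r[1] - m2) * (r[2] - my) for r in rows)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


slope_y1, slope_x = ols_two_predictors(simulate())
```

With these values the slope on $Y_1$ should be near $\beta_{31} + \beta_{32}\beta_{21} = 0.62$ and the slope on X near $\beta_{32}\gamma_2 + \gamma_3 = -0.19$, matching the coefficients of Equation 4.17.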

4.1.3    Effects  Defined  as  Derivatives 

Note that, using a derivative calculation on Equation 4.17 above, the TE can
be derived as

$$\frac{\partial E(Y_3 \mid Y_1 = y_1, X = x)}{\partial y_1} = \frac{\partial f_3(y_1, f_2 + e_2)}{\partial y_1} = \beta_{31} + \beta_{32}\beta_{21}, \tag{4.18}$$

where $f_3$ is the linear mean function expressed in Equation 4.3, with the function
$f_2$ included via substitution for $Y_2$. Henceforth, we will write

$$\frac{\partial E(Y_3 \mid Y_1 = y_1, X = x)}{\partial y_1} = TE_{31}$$

to emphasize that this derivative is the TE of $Y_1$ on $Y_3$, when holding X constant
at x. For convenience and simplification, the conditional expectation
$E(Y_3 \mid Y_1 = y_1, X = x)$ will, at times, be written $E(Y_3 \mid y_1, x)$.

Also, it should be noted that Equation 4.16 is a linear structural model in
variables $Y_1$ and $Y_3$. Consequently, the coefficient $\beta_{31} + \beta_{32}\beta_{21}$ measures the effect
of a one unit change in $Y_1$ on the conditional mean function of $Y_3$, given $Y_1$, while
holding X constant. Accordingly, this quantity translates to the "total effect" of
$Y_1$ on $Y_3$. Likewise, the quantity $\beta_{21}$ is the coefficient for $Y_1$ in the linear model
involving the variables, $Y_1$ and $Y_2$, as in Equation 4.2 above. So, $\beta_{21}$ measures
the DE of $Y_1$ on $Y_2$ while holding X constant. Again, the DE of $Y_1$ on $Y_2$ can be
derived as

$$\frac{\partial E(Y_2 \mid y_1, x; \boldsymbol{\beta}_2, \gamma_2)}{\partial y_1} = \beta_{21}.$$

Correspondingly, the DE of $Y_2$ on $Y_3$ is the quantity $\beta_{32}$. This value can be
derived as

$$\frac{\partial E(Y_3 \mid y_1, y_2, x; \boldsymbol{\beta}_3, \gamma_3)}{\partial y_2} = \beta_{32},$$

and measures the DE of $Y_2$ on $Y_3$ while holding $Y_1$ and X constant. The product of
these two quantities is defined to be the IE of $Y_1$ on $Y_3$ through $Y_2$, that is, $\beta_{32}\beta_{21}$.

Likewise, the DE of $Y_1$ on $Y_3$ is derived as

$$\frac{\partial E(Y_3 \mid y_1, y_2, x; \underline{\beta}_3, \gamma_3)}{\partial y_1} = \beta_{31},$$

and measures the DE of $Y_1$ on $Y_3$ while holding $X$ constant. Therefore, total, direct and indirect effects can be defined as derivatives and products of derivatives.

4.1.4    The  Multivariable  Chain  Rule  and  the  "Calculus  of  Coefficients"

In this section we show that the COC derived above by recursive substitution can also be derived by the MVCR. At times, for simplicity of notation and convenience, we will suppress the full notation that expresses each $f_k$ as a function of previous $Y$'s, $x$, $\underline{\beta}$ and $\gamma$, writing the $f_k(Y_1, Y_2, \ldots, Y_{k-1}, x; \underline{\beta}_k, \gamma_k)$ in simplified form.

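We write the TE of $Y_1$ on $Y_3$ as

$$\begin{aligned}
TE_{31} &= \frac{\partial E(Y_3 \mid y_1, x)}{\partial y_1} \\
&= \frac{\partial}{\partial y_1} E_{\epsilon_2, \epsilon_3}[f_3(y_1, f_2 + \epsilon_2) + \epsilon_3] \\
&= \frac{\partial}{\partial y_1}[f_3(y_1, f_2)] \\
&= \frac{\partial}{\partial y_1}\{\beta_{30} + \beta_{31} y_1 + \beta_{32} \cdot f_2 + \gamma_3 x\}
\end{aligned}$$

because the errors enter linearly and have expected values of zero. Applying the MVCR, we have

$$\begin{aligned}
TE_{31} &= \frac{\partial f_3}{\partial y_1} + \frac{\partial f_3}{\partial f_2} \cdot \frac{\partial f_2}{\partial y_1} && (4.19) \\
&= \beta_{31} + \beta_{32} \frac{\partial f_2}{\partial y_1} && (4.20) \\
&= \beta_{31} + \beta_{32} \frac{\partial}{\partial y_1}(\beta_{20} + \beta_{21} y_1 + \gamma_2 x) && (4.21) \\
&= \beta_{31} + \beta_{32}\beta_{21}, && (4.22)
\end{aligned}$$

which is the same as the result obtained in Equation 4.18 above by taking the derivative directly. Note that in this derivation, we take the expectation before taking the derivative.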
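
As a numerical illustration of the partition above, the following minimal sketch compares a finite-difference derivative of the conditional mean of Y3 with the sum of the direct and indirect coefficients. The coefficient values and function names are hypothetical, chosen only for illustration, and the exogenous variable X is omitted for brevity.

```python
# Hypothetical coefficients for a three-variable linear recursive model
# (illustrative values only; they do not come from the text).
B20, B21 = 0.5, 1.2               # Y2 = B20 + B21*Y1 + e2
B30, B31, B32 = -1.0, 0.7, 0.4    # Y3 = B30 + B31*Y1 + B32*Y2 + e3


def f2(y1):
    """Linear mean function of Y2 given Y1 = y1."""
    return B20 + B21 * y1


def f3(y1, y2):
    """Linear mean function of Y3 given Y1 = y1 and Y2 = y2."""
    return B30 + B31 * y1 + B32 * y2


def cond_mean_y3(y1):
    """E(Y3 | y1): the zero-mean errors enter linearly and drop out."""
    return f3(y1, f2(y1))


def total_effect_fd(y1, h=1e-6):
    """Central finite-difference approximation to dE(Y3 | y1)/dy1."""
    return (cond_mean_y3(y1 + h) - cond_mean_y3(y1 - h)) / (2 * h)


direct_effect = B31            # partial derivative of f3 with respect to y1
indirect_effect = B32 * B21    # (df3/df2) * (df2/dy1)
total_effect = direct_effect + indirect_effect
```

Under these assumed values, `total_effect_fd(y1)` agrees with `total_effect` at any `y1`, up to floating-point error, because the conditional mean is linear in `y1`.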
An identical result can be achieved in a third manner by using rules of iterated conditional expectations (Mood et al. [50]). This partitioning of the TE of $Y_1$ on $Y_3$ into direct and indirect effects can be derived in a generalizable way as follows. Without loss of generality, we assume no exogenous variables and write

$$\begin{aligned}
E(Y_3 \mid y_1) &= E_{Y_2 \mid Y_1}[E_{Y_3 \mid Y_1, Y_2}(Y_3 \mid y_1, Y_2)] && (4.23) \\
&= E_{Y_2 \mid Y_1}[f_3(y_1, Y_2)] && (4.24) \\
&= E_{\epsilon_2 \mid Y_1}\{f_3[y_1, f_2(y_1) + \epsilon_2]\}. && (4.25)
\end{aligned}$$

Now, specifically applied to Equations 4.2-4.15 which are linear, this expectation is

$$E(Y_3 \mid y_1) = \beta_{30} + \beta_{31} y_1 + \beta_{32} E_{Y_2 \mid Y_1}(Y_2 \mid y_1),$$

where $E_{Y_2 \mid Y_1}(Y_2 \mid y_1) = f_2(y_1) = \beta_{20} + \beta_{21} y_1$.
Recalling that

$$TE_{31} = \frac{\partial E(Y_3 \mid y_1)}{\partial y_1},$$

and, by substituting Equation 4.23 for $E(Y_3 \mid y_1)$, we have

$$TE_{31} = \frac{\partial E_{\epsilon_2 \mid Y_1}[f_3(y_1, f_2 + \epsilon_2)]}{\partial y_1}.$$

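It follows from model assumptions three and four in Section 4.1.1 above (specifically, the error terms are mutually independent and the error terms are independent of the exogenous variables) that the distribution of $\epsilon_2$ given $Y_1$ is independent of the given value of $Y_1$. Accordingly, the derivative and expectation in $TE_{31}$ can be interchanged and

$$TE_{31} = E_{\epsilon_2}\left[\frac{\partial f_3(y_1, f_2 + \epsilon_2)}{\partial y_1}\right]. \tag{4.26}$$

Then, by applying the MVCR while still conditioning on $X = x$, we have that

$$\begin{aligned}
TE_{31} &= E_{\epsilon_2}\left[\frac{\partial f_3}{\partial y_1} + \frac{\partial f_3}{\partial (f_2 + \epsilon_2)} \cdot \frac{\partial (f_2 + \epsilon_2)}{\partial y_1}\right] && (4.27) \\
&= E_{\epsilon_2}\left[\frac{\partial f_3}{\partial y_1}\right] + E_{\epsilon_2}\left[\frac{\partial f_3(f_2 + \epsilon_2)}{\partial f_2}\right] \frac{\partial f_2}{\partial y_1}. && (4.28)
\end{aligned}$$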

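In this derivation, we have taken the derivative before the expectation and used the fact that, by the Chain Rule, we have

$$\begin{aligned}
\frac{\partial f_3(f_2 + \epsilon_2)}{\partial f_2} &= \frac{\partial f_3}{\partial (f_2 + \epsilon_2)} \cdot \frac{\partial (f_2 + \epsilon_2)}{\partial f_2} && (4.29) \\
&= \frac{\partial f_3}{\partial (f_2 + \epsilon_2)}. && (4.30)
\end{aligned}$$

Now, for the linear $f_2$ and $f_3$ specified in Equations 4.13 and 4.14, we have that

$$E_{\epsilon_2}\left[\frac{\partial f_3}{\partial y_1}\right] = \frac{\partial f_3}{\partial y_1}$$

and, also,

$$\frac{\partial f_3}{\partial y_1} = \beta_{31}.$$

Furthermore,

$$E_{\epsilon_2}\left[\frac{\partial f_3(y_1, f_2(y_1) + \epsilon_2)}{\partial f_2}\right] = \beta_{32}$$

and

$$\frac{\partial f_2}{\partial y_1} = \beta_{21}.$$

Thus, for this linear system, we can write the TE in Equation 4.28 above as

$$TE_{31} = \beta_{31} + \beta_{32}\beta_{21},$$

which equals the quantity derived in both Equations 4.18 and 4.19-4.22 above.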

In this application, we see the "Calculus of Coefficients" methodology implemented, as previously discussed. One should note that using the notion of derivatives, either through the partial derivative methodology shown in Section 4.1.3 or by using the MVCR, as seen in the current section, yields the same results as those obtained through recursive substitution.

More importantly, we see that the TE of $Y_1$ on $Y_3$, that is, the value of $\partial E(Y_3 \mid y_1, x) / \partial y_1$, is equal to the sum of the direct and indirect effects, respectively, $\beta_{31}$ and $\beta_{32}\beta_{21}$. Therefore, we see that the TE is the sum of these two values by theory and derivation, not just by definition, as is commonly done in previous literature (for example, Wright [74] and Freedman et al. [15]). That is, if we define the TE of $Y_1$ on $Y_3$ to be

$$TE = \frac{\partial E_{Y_3 \mid Y_1}(Y_3 \mid y_1, x)}{\partial y_1},$$

then by the MVCR, we show that the TE is the sum of a DE and an IE. Explicitly, TE = DE + IE, where

$$DE = E_{\epsilon_2}\left[\frac{\partial f_3}{\partial y_1}\right]$$

and

$$IE = E_{\epsilon_2}\left[\frac{\partial f_3}{\partial f_2}\right] \cdot \frac{\partial f_2}{\partial y_1}.$$

As will be discussed in this dissertation, this partitioning of the TE into sums of the DE and all appropriate IEs extends to differentiable nonlinear functions on continuous domains.

4.1.5    Higher  Order  Indirect  Effects

Previously, we derived first-order indirect effects. When there is more than one intermediate variable in the causal chain, there are other IEs to consider. For example, when considering the effects of $Y_1$ on $Y_4$, we have three IEs that occur. In this section we show, using the MVCR method, how such higher order IEs are included in the partitioning of TEs.

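We also can use the MVCR recursively as in Equation 3.4 above to partition the TE. In the four variable example above, for example, the TE of $Y_1$ on $Y_4$, that is, $\partial E(Y_4 \mid y_1) / \partial y_1$, can be partitioned as

$$\frac{\partial E(Y_4 \mid y_1)}{\partial y_1} = \frac{\partial f_4}{\partial y_1} + \frac{\partial f_4}{\partial f_2} \frac{\partial f_2}{\partial y_1} + \frac{\partial f_4}{\partial f_3} \frac{d f_3}{d y_1} \tag{4.31}$$

by an initial application of the MVCR. But, by a second application of the MVCR to $f_3$, we can write

$$\frac{d f_3}{d y_1} = \frac{\partial f_3}{\partial y_1} + \frac{\partial f_3}{\partial f_2} \frac{\partial f_2}{\partial y_1}. \tag{4.32}$$

Finally by substitution of Equation 4.32 into Equation 4.31,

$$\begin{aligned}
TE_{41} &= \frac{\partial E(Y_4 \mid y_1)}{\partial y_1} && (4.33) \\
&= \frac{\partial f_4}{\partial y_1} + \frac{\partial f_4}{\partial f_2} \frac{\partial f_2}{\partial y_1} + \frac{\partial f_4}{\partial f_3} \left(\frac{\partial f_3}{\partial y_1} + \frac{\partial f_3}{\partial f_2} \frac{\partial f_2}{\partial y_1}\right) && (4.34) \\
&= \frac{\partial f_4}{\partial y_1} + \frac{\partial f_4}{\partial f_2} \frac{\partial f_2}{\partial y_1} + \frac{\partial f_4}{\partial f_3} \frac{\partial f_3}{\partial y_1} + \frac{\partial f_4}{\partial f_3} \frac{\partial f_3}{\partial f_2} \frac{\partial f_2}{\partial y_1}, && (4.35)
\end{aligned}$$

which gives an expansion for $TE_{41}$ that is completely expressed in terms of all direct and indirect effects in the causal chain linking $Y_1$ through $Y_4$. This allows each TE of a variable in the chain to be expressed as the sum of the DE of the variable of interest and IEs of the variable of interest through all intermediate paths in the chain.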

More explicitly, considering the example directly above, the TE of $Y_1$ on $Y_4$, or $TE_{41}$, can be written as the sum of $DE_{41}$ and all associated IEs, here $IE_{421}$ = the first-order indirect effect of $Y_1$ on $Y_4$ through $Y_2$ and $IE_{431}$ = the first-order indirect effect of $Y_1$ on $Y_4$ through $Y_3$. Similarly, $IE_{4321}$ = the second-order indirect effect of $Y_1$ on $Y_4$ through $Y_2$ and then through $Y_3$. This method provides an alternative derivation of the "Calculus of Coefficients" (COC). Thus, the MVCR is important to the methods of classical path analysis because, if all variables in the system of equations are continuous and linearly related, the COC results can be obtained either by the recursive substitution method or by use of the MVCR applied to appropriate conditional expectations.
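
The equivalence of the two routes in the linear case can be checked numerically. The sketch below, using hypothetical path coefficients stored in a dictionary `beta` (not values from the text), enumerates every path from Y1 to Y4, multiplies the coefficients along each path, and compares the sum with the slope obtained by recursive substitution.

```python
from itertools import combinations

# Hypothetical path coefficients: beta[(l, k)] is the direct effect of Y_k on Y_l.
beta = {(2, 1): 0.5, (3, 1): 0.3, (3, 2): 0.8,
        (4, 1): 0.2, (4, 2): -0.4, (4, 3): 0.6}


def path_effect(path):
    """Product of the coefficients along a path such as (1, 2, 3, 4)."""
    prod = 1.0
    for k, l in zip(path, path[1:]):
        prod *= beta[(l, k)]
    return prod


def effects(k, l):
    """Map every path from Y_k to Y_l (direct or indirect) to its effect."""
    between = range(k + 1, l)
    out = {}
    for r in range(len(between) + 1):
        for mid in combinations(between, r):
            path = (k, *mid, l)
            out[path] = path_effect(path)
    return out


def total_by_substitution(k, l):
    """Slope of E(Y_l | y_k) obtained by substituting the earlier equations."""
    slope = {k: 1.0}
    for j in range(k + 1, l + 1):
        slope[j] = sum(beta[(j, i)] * slope[i] for i in range(k, j))
    return slope[l]


parts = effects(1, 4)          # keys: (1, 4), (1, 2, 4), (1, 3, 4), (1, 2, 3, 4)
te_41 = sum(parts.values())    # DE41 + IE421 + IE431 + IE4321
```

Here `parts[(1, 4)]` plays the role of DE41 and `parts[(1, 2, 3, 4)]` that of the second-order effect IE4321; `te_41` matches `total_by_substitution(1, 4)` for these assumed coefficients.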

One should note that in each of the derivations above the error terms disappear when taking expectations, due to the assumption of a classical linear model. This is not the case, as we will see later, when considering models that are nonlinear or fail to follow the classical assumptions.

4.2     The  General  Path  Model 
For more general cases than those discussed above, let $Y_1, Y_2, \ldots, Y_p$ be a sequence of causally ordered random variables. Consider the set of structural equations shown in Equations 4.2-4.5 that define the path model describing the inter-relationships among these variables, omitting exogenous variables, without loss of generality. That is, recall the system:


$$\begin{aligned}
Y_1 &= \mu_1 + \epsilon_1 && (4.36) \\
Y_2 &= f_2(Y_1; \underline{\beta}_2) + \epsilon_2 && (4.37) \\
&\;\;\vdots \\
Y_{p-1} &= f_{p-1}(Y_1, Y_2, \ldots, Y_{p-2}; \underline{\beta}_{p-1}) + \epsilon_{p-1} && (4.38) \\
Y_p &= f_p(Y_1, Y_2, \ldots, Y_{p-1}; \underline{\beta}_p) + \epsilon_p && (4.39)
\end{aligned}$$

where the $Y_k$'s, $k = 1, 2, \ldots, p$, are not necessarily continuous variables and the $f_k$ functions are not necessarily linear functions. In the system of equations given immediately above, the error terms, $\epsilon_2$ through $\epsilon_p$, are not necessarily independent and, hence, not necessarily independent of previous $Y$'s in the general model. Also, these error terms are not necessarily independent of exogenous variables that may be in the system of equations. More concisely, only assumptions two and five listed in Section 4.1.1 are required to hold in the general path model. That is, the error terms are assumed to have expectation of zero conditional on all previous endogenous variables in the causal system and each variable has a structural relationship with only the variables that occur previously in the causal chain.

The  development  of  an  analog  to  the  COC  for  this  model  is  a  main  goal  of 
this  dissertation.  To  this  end,  we  develop  an  analog  to  the  COC  for  continuous 
variables  in  Chapter  5.  In  Chapter  6,  we  develop  an  analog  for  dichotomous 
variable  models. 


CHAPTER  5 
NONLINEAR  PATH  MODELS  WITH  CONTINUOUS  VARIABLES 

With the generalized view of the COC for classical path models described above in Chapter 4, we can analyze not only the linear models previously mentioned, but nonlinear models as well. In this chapter, the COC is generalized to a class of nonlinear models. The generalized result will be called the Calculus of Effects (COE).

In this chapter, we consider the general model from Section 4.2 with continuous endogenous variables. That is, the $Y_k$ variables, $k = 1, 2, \ldots, p$, are continuous. Section 5.1 uses a specific model as an example to introduce the nonlinear path model. Section 5.2 deals with estimation issues that arise in these nonlinear models, illustrated by the example given in Section 5.1. In subsequent sections a four variable model is discussed, as well as the p variable model, both under the classical assumptions, and formal definitions of effects are given. In Section 5.6, models with endogenous variables that have conditional exponential distributions are examined.

5.1     An  Introductory  Example 
Consider the example of a nonlinear path model presented in Chapter 1, Equations 1.1-1.3. Note that we can also express the $Y$ variables as

$$\begin{aligned}
Y_1 &= \mu_1 + \epsilon_1 \\
Y_2 &= f_2(Y_1; \underline{\beta}_2) + \epsilon_2 \\
Y_3 &= f_3(Y_1, Y_2; \underline{\beta}_3) + \epsilon_3
\end{aligned}$$

where $f_2$ and $f_3$ are nonlinear functions, specified in Equations 1.2 and 1.3, $\underline{\beta}_2' = (\beta_1, \beta_2, \alpha)$ and $\underline{\beta}_3' = (\beta_3, \beta_4, \beta_5, \beta_6, \beta_7)$. Also, note that the model for variable $Y_2$ is a modification of the Gamma density function (Mood et al. [50]). The model for $Y_3$ is the sum of two first order reaction curves, also known as Mitscherlich's Law (Snedecor and Cochran [61]).

For this example, we make the assumptions that the error terms, $\epsilon_1$, $\epsilon_2$ and $\epsilon_3$, have zero expectation, conditional on all previous variables in the causal chain, and that the error terms are mutually independent. We make the additional assumption that each of the $f_k$ functions, $k = 1, 2, 3$, is continuously differentiable with respect to previous endogenous variables. Note that each $Y_k$ has a structural relationship exclusively with the variables occurring previously in the causal chain. (Henceforth, we refer to this type of structural relationship as strictly ordered.) Also, the density function for each $Y_k$ is differentiable everywhere with continuous derivative.

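From methodology presented in Section 4.1.4, where TEs are studied as derivatives of conditional expectations, we must find $\partial E(Y_3 \mid y_1) / \partial y_1$ to study the total effect of variable $Y_1$ on variable $Y_3$. (Note that this definition of TE will be formalized in Section 5.4.) From substitution of $Y_2$, as a function of $Y_1 = y_1$, into $Y_3$ we see that

$$Y_3 = \beta_3 + \beta_4 \exp(\beta_5 Y_1) + \beta_6 \exp\left[\beta_7\left(\beta_1 + \frac{1}{\beta_2^{\alpha}\,\Gamma(\alpha)}\, y_1^{\alpha - 1} \exp\left(-\frac{y_1}{\beta_2}\right) + \epsilon_2\right)\right] + \epsilon_3.$$

Hence, we have that

$$\begin{aligned}
E(Y_3 \mid Y_1 = y_1) = {} & \beta_3 + \beta_4 \exp(\beta_5 y_1) \\
& + E_{\epsilon_2 \mid Y_1}\left\{\beta_6 \exp\left[\beta_7\left(\beta_1 + \frac{1}{\beta_2^{\alpha}\,\Gamma(\alpha)}\, y_1^{\alpha - 1} \exp\left(-\frac{y_1}{\beta_2}\right) + \epsilon_2\right)\right] \,\Big|\, Y_1 = y_1\right\}.
\end{aligned} \tag{5.1}$$

Note that the $\epsilon_3$ term drops out, because it has a conditional expectation of zero and enters linearly in the equation for $Y_3$. The $\epsilon_2$ term, however, enters the equation exponentially. Thus, unlike the case with linear models, this term does not drop out of the equation for $E(Y_3 \mid Y_1 = y_1)$. Therefore, the expectation involved in Equation 5.1 results in evaluating

$$\begin{aligned}
E(Y_3 \mid Y_1 = y_1) = {} & \beta_3 + \beta_4 \exp(\beta_5 y_1) \\
& + \beta_6 \exp\left[\beta_7\left(\beta_1 + \frac{1}{\beta_2^{\alpha}\,\Gamma(\alpha)}\, y_1^{\alpha - 1} \exp\left(-\frac{y_1}{\beta_2}\right)\right)\right] \cdot E_{\epsilon_2 \mid Y_1}\{[\exp(\beta_7 \epsilon_2)] \mid Y_1 = y_1\} && (5.2) \\
= {} & \beta_3 + \beta_4 \exp(\beta_5 y_1) \\
& + \beta_6 \exp\left[\beta_7\left(\beta_1 + \frac{1}{\beta_2^{\alpha}\,\Gamma(\alpha)}\, y_1^{\alpha - 1} \exp\left(-\frac{y_1}{\beta_2}\right)\right)\right] \cdot E_{\epsilon_2}[\exp(\beta_7 \epsilon_2)], && (5.3)
\end{aligned}$$

because $\epsilon_2$ is independent of $Y_1$, a fact that follows from the assumption that the $\epsilon_1$, $\epsilon_2$ and $\epsilon_3$ are mutually independent.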

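Now, taking the derivative with respect to $y_1$, we obtain the TE of $Y_1$ on $Y_3$. That is,

$$\begin{aligned}
TE_{31} = \frac{\partial E(Y_3 \mid Y_1 = y_1)}{\partial y_1} = {} & \beta_4 \beta_5 e^{\beta_5 y_1} \\
& + \beta_6 \beta_7 e^{\beta_7 f_2(y_1, \underline{\beta}_2)} \left[f_2(y_1, \underline{\beta}_2) - \beta_1\right] \left[(\alpha - 1) y_1^{-1} - \beta_2^{-1}\right] E_{\epsilon_2}(e^{\beta_7 \epsilon_2}).
\end{aligned} \tag{5.4}$$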

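Note that the quantity immediately above can also be derived via the MVCR as follows:

$$TE_{31} = \frac{\partial E(Y_3 \mid Y_1 = y_1)}{\partial y_1} = E_{\epsilon_2}\left[\frac{\partial f_3}{\partial y_1} + \frac{\partial f_3}{\partial y_2} \cdot \frac{\partial f_2}{\partial y_1}\right],$$

where

$$\frac{\partial f_3}{\partial y_1} = \beta_4 \beta_5 e^{\beta_5 y_1}, \qquad \frac{\partial f_3}{\partial y_2} = \beta_6 \beta_7 e^{\beta_7 [f_2(y_1, \underline{\beta}_2) + \epsilon_2]}$$

and

$$\frac{\partial f_2}{\partial y_1} = (f_2 - \beta_1)\left[(\alpha - 1) y_1^{-1} - \beta_2^{-1}\right].$$

Thus, taking the expectation, we have that

$$TE_{31} = \beta_4 \beta_5 e^{\beta_5 y_1} + \beta_6 \beta_7 e^{\beta_7 f_2} (f_2 - \beta_1) \left[(\alpha - 1) y_1^{-1} - \beta_2^{-1}\right] E_{\epsilon_2}(e^{\beta_7 \epsilon_2}), \tag{5.5}$$

which equals Equation 5.4. This result is analogous to the COC partitioning, except with average DE and average IEs substituted in place of DE and IEs in classical linear path models. Note that the partitioning of $TE_{31}$ immediately above depends on the assumption of independent error terms.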

In order to estimate the direct and indirect effects in the above TE, we must estimate each piece of the quantity immediately above. Thus, the quantity $E_{\epsilon_2}[\exp(\beta_7 \epsilon_2)]$ must be estimated, in addition to estimating all parameters. Therefore, in contrast to linear models, estimating the model coefficients is not enough to estimate the direct and indirect effects comprising the total effect. Methodology to estimate expectations that arise in systems of nonlinear equations will be more fully discussed and developed in the following section.
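
The partition of the total effect in this example can be checked numerically. The sketch below assumes hypothetical parameter values and, as a further assumption, a normally distributed error in the Y2 equation, so that the expectation of exp(beta7 * e2) has the closed form exp(beta7^2 * sigma^2 / 2). It compares the sum of the average direct and average indirect effects with a finite-difference derivative of the conditional mean of Y3.

```python
import math

# Hypothetical parameter values, for illustration only (not estimates from the text).
B1, B2, ALPHA = 0.2, 1.5, 2.5                     # parameters of f2
B3, B4, B5, B6, B7 = 1.0, 0.5, 0.3, 2.0, 0.4      # parameters of f3
SIGMA2 = 0.6   # assumed standard deviation of e2, with e2 ~ N(0, SIGMA2**2)


def f2(y1):
    """Modified Gamma density used as the mean function of Y2 (requires y1 > 0)."""
    scale = B2 ** ALPHA * math.gamma(ALPHA)
    return B1 + y1 ** (ALPHA - 1) * math.exp(-y1 / B2) / scale


def df2_dy1(y1):
    """Derivative of f2 with respect to y1."""
    return (f2(y1) - B1) * ((ALPHA - 1) / y1 - 1 / B2)


# E[exp(B7 * e2)] for normal e2 (moment generating function of the normal).
M = math.exp(0.5 * (B7 * SIGMA2) ** 2)


def cond_mean_y3(y1):
    """E(Y3 | Y1 = y1) after the expectation over e2 has been taken."""
    return B3 + B4 * math.exp(B5 * y1) + B6 * math.exp(B7 * f2(y1)) * M


def avg_direct(y1):
    """Average direct effect of Y1 on Y3."""
    return B4 * B5 * math.exp(B5 * y1)


def avg_indirect(y1):
    """Average indirect effect of Y1 on Y3 through Y2."""
    return B6 * B7 * math.exp(B7 * f2(y1)) * df2_dy1(y1) * M
```

For a positive `y1`, `avg_direct(y1) + avg_indirect(y1)` should agree with a central finite difference of `cond_mean_y3` to within the discretization error. When the error distribution is not normal, the constant `M` would instead have to be estimated.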

5.2     Monte  Carlo  Estimation  of  Expectations 
To estimate the expectations of direct and indirect effects in TE quantities, we must first obtain estimates of the $\underline{\beta}_k$ parameters within the system of structural equations. Estimates of these $\underline{\beta}_k$ parameters can be obtained by separately fitting each individual equation within the system of equations. The Nonlinear Least Squares (NLS) estimates of each equation can be obtained using widely available software, such as PROC NLIN [58], of the Statistical Analysis System (SAS) [57].

Using the estimated values of the individual $\underline{\beta}_k$ parameters, that is, using the $\hat{\underline{\beta}}_k$, any quantities involving expectations of nonlinear functions of the error terms $\epsilon_k$ (such as in Equation 5.5) must be estimated. When the $\epsilon_k$ are independent, each expectation involving a particular $\epsilon_k$ can be estimated separately. Estimation of these expectations can be achieved via Monte Carlo (MC) integration techniques.

MC integration is an approach that can be used to estimate expectations (Gilks et al. [20]). The MC integration technique draws samples from the necessary distribution and then forms sample averages which are used to approximate expectations. More specifically, MC integration evaluates $E[g(X)]$ by sampling $n$ items, $\{X_t, t = 1, \ldots, n\}$, from the distribution of $X$. The value of $E[g(X)]$ is then approximated by $\frac{1}{n} \sum_{t=1}^{n} g(X_t)$. By increasing the sample size $n$, this approximation can be made as accurate as desired.

As an example, consider the case where $X_t$ is normally distributed with mean zero and variance $\sigma^2$. That is, $X_t$ follows the $N(0, \sigma^2)$ distribution. If $\sigma^2$ is unknown, an independent estimate of $\sigma^2$, say $\hat{\sigma}^2$, must be obtained. Then, we can sample items from the $N(0, \hat{\sigma}^2)$ distribution to estimate $E[g(X)]$. An example of this technique is presented in Sections 8.1.3 and 8.1.4.

To estimate the direct and indirect effects in $TE_{31}$ in Equation 5.5, we first estimate $\alpha$, $\beta_1$, $\beta_2$ and $\sigma_2^2$ by fitting Equation 1.2 and $\beta_3$, $\beta_4$, $\beta_5$, $\beta_6$ and $\beta_7$ by fitting Equation 1.3. Then the $E_{\epsilon_2}[e^{\beta_7 \epsilon_2}]$ is estimated by generating samples of $\epsilon_2$ from a $N(0, \hat{\sigma}_2^2)$ distribution and averaging the resulting $e^{\hat{\beta}_7 \epsilon_{2i}}$, $i = 1, 2, \ldots, n$.
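
The averaging step just described can be sketched in a few lines. The values `beta7_hat` and `sigma2_hat` below are hypothetical stand-ins for estimates that would come from fitting the model, and the helper `mc_expectation` is an illustrative name rather than part of any package. Because the sampled errors are normal, the Monte Carlo average can be compared with the exact value exp(beta7^2 * sigma^2 / 2).

```python
import math
import random


def mc_expectation(g, sampler, n):
    """Approximate E[g(X)] by averaging g over n independent draws from sampler()."""
    return sum(g(sampler()) for _ in range(n)) / n


beta7_hat = 0.4     # hypothetical estimate of beta_7
sigma2_hat = 0.6    # hypothetical estimate of the standard deviation of e2

rng = random.Random(12345)   # fixed seed so the run is reproducible
estimate = mc_expectation(
    lambda e: math.exp(beta7_hat * e),       # g(e2) = exp(beta7 * e2)
    lambda: rng.gauss(0.0, sigma2_hat),      # draw e2 from N(0, sigma2_hat**2)
    200_000,
)

# Exact value for normal errors, used here only as a check on the MC average.
exact = math.exp(0.5 * (beta7_hat * sigma2_hat) ** 2)
```

With a sample of this size the Monte Carlo standard error is well below 0.001 for these assumed values, so `estimate` and `exact` agree to about two decimal places.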

5.3    A  Four  Variable  Model
Now, consider the general four variable model, as given below (exogenous variables are omitted from this discussion without loss of generality):

$$\begin{aligned}
Y_1 &= \mu_1 + \epsilon_1 && (5.6) \\
Y_2 &= f_2(Y_1) + \epsilon_2 && (5.7) \\
Y_3 &= f_3(Y_1, Y_2) + \epsilon_3 && (5.8) \\
Y_4 &= f_4(Y_1, Y_2, Y_3) + \epsilon_4, && (5.9)
\end{aligned}$$

where the $Y_k$ variables are continuous, the $f_k$ are unspecified continuously differentiable nonlinear functions and the $\epsilon_k$ have conditional expectation of zero given preceding endogenous variables in the causal chain. Note that we are not yet making the assumption of mutual independence of the error terms in this model.

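5.3.1     Total  Effect  of  Y1  on  Y4

To estimate the TE of $Y_1$ on $Y_4$, we must estimate $\partial E(Y_4 \mid Y_1 = y_1) / \partial y_1$. Following the derivation presented using Equations 4.23 - 4.26 in Section 4.1.4 for the classical linear model, we write

$$\begin{aligned}
E(Y_4 \mid Y_1 = y_1) &= E_{Y_2, Y_3 \mid Y_1}[E_{Y_4 \mid Y_1, Y_2, Y_3}(Y_4 \mid Y_1 = y_1, Y_2, Y_3)] \\
&= E_{Y_2, Y_3 \mid Y_1}[f_4(y_1, Y_2, Y_3)] \\
&= E_{\epsilon_2, \epsilon_3 \mid Y_1}\{f_4[y_1, f_2(y_1) + \epsilon_2, f_3(y_1, f_2(y_1) + \epsilon_2) + \epsilon_3]\}.
\end{aligned}$$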

Note again that currently, in this system, we only assume that the errors have zero conditional expectation and that each variable has a structural relationship exclusively with the variables occurring previously in the causal chain. We do not assume at this point that the errors are independent as in the classical linear model. Hence, the error term in any given equation is not guaranteed to be independent of endogenous variables also occurring in that equation. Thus, we cannot interchange the derivative and the expectation in the formulation of

$$TE_{41} = \frac{\partial E(Y_4 \mid Y_1 = y_1)}{\partial y_1} = \frac{\partial E_{\epsilon_2, \epsilon_3 \mid Y_1}\{f_4[y_1, f_2(y_1) + \epsilon_2, f_3(y_1, f_2(y_1) + \epsilon_2) + \epsilon_3]\}}{\partial y_1}.$$

However, with no further assumptions made, the following derivation holds:

$$\begin{aligned}
TE_{41} &= \frac{\partial E(Y_4 \mid Y_1 = y_1)}{\partial y_1} && (5.10) \\
&= \frac{\partial}{\partial y_1} E_{\epsilon_2, \epsilon_3 \mid Y_1}(f_4). && (5.11)
\end{aligned}$$

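Therefore, we can write

$$TE_{41} = \frac{\partial}{\partial y_1} \int_{\epsilon_2} \int_{\epsilon_3} f_4 \cdot g_{\epsilon_2, \epsilon_3 \mid Y_1}(\epsilon_2, \epsilon_3 \mid y_1)\, d\epsilon_2\, d\epsilon_3. \tag{5.12}$$

Interchanging the integrals and derivative, we have

$$\begin{aligned}
TE_{41} = {} & \int_{\epsilon_2} \int_{\epsilon_3} \frac{d f_4}{d y_1} \cdot g_{\epsilon_2, \epsilon_3 \mid Y_1}(\epsilon_2, \epsilon_3 \mid y_1)\, d\epsilon_2\, d\epsilon_3 \\
& + \int_{\epsilon_2} \int_{\epsilon_3} f_4 \cdot \frac{\partial g_{\epsilon_2, \epsilon_3 \mid Y_1}(\epsilon_2, \epsilon_3 \mid y_1)}{\partial y_1}\, d\epsilon_2\, d\epsilon_3,
\end{aligned} \tag{5.13}$$

where we have suppressed the arguments of the function $f_4$ and Leibniz's Rule (Section 3.3) has been applied to interchange the derivative and integrals.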

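If we now impose the classical assumptions of mutual independence among the $\epsilon_k$'s, then the second term in Equation 5.13 is zero. This term is zero because

$$\frac{\partial g_{\epsilon_2, \epsilon_3 \mid Y_1}(\epsilon_2, \epsilon_3 \mid y_1)}{\partial y_1} = \frac{\partial g_{\epsilon_2, \epsilon_3}(\epsilon_2, \epsilon_3)}{\partial y_1} = 0$$

due to the fact that $g_{\epsilon_2, \epsilon_3 \mid Y_1}(\epsilon_2, \epsilon_3 \mid y_1)$ is not a function of $y_1$ when $\epsilon_2$ and $\epsilon_3$ are independent of $\epsilon_1$ and, hence, of $Y_1$. Therefore, under the classical assumptions of independent $\epsilon_k$, we have

$$TE_{41} = E_{\epsilon_2, \epsilon_3}\left[\frac{d f_4}{d y_1}\right]. \tag{5.14}$$

Now, we can implement the MVCR inside the expectation and write

$$TE_{41} = E_{\epsilon_2, \epsilon_3}\left[\frac{\partial f_4}{\partial y_1} + \frac{\partial f_4}{\partial y_2} \cdot \frac{\partial f_2}{\partial y_1} + \frac{\partial f_4}{\partial y_3} \cdot \frac{d f_3}{d y_1}\right]. \tag{5.15}$$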

Noting that the $f_k$ are continuous and differentiable functions and recalling that, as was shown in Equations 4.29 and 4.30 for the case when $l = 3$ and $k = 2$, we can write


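$$\frac{\partial f_4}{\partial f_2} = \frac{\partial f_4}{\partial (f_2 + \epsilon_2)} = \frac{\partial f_4}{\partial y_2}.$$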

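By applying the MVCR once more to $d f_3 / d y_1$ and using Equations 5.16-5.18 and similar results, we have

$$\begin{aligned}
TE_{41} &= E_{\epsilon_2, \epsilon_3}\left[\frac{\partial f_4}{\partial y_1} + \frac{\partial f_4}{\partial y_2} \frac{\partial f_2}{\partial y_1} + \frac{\partial f_4}{\partial y_3} \left(\frac{\partial f_3}{\partial y_1} + \frac{\partial f_3}{\partial y_2} \frac{\partial f_2}{\partial y_1}\right)\right] && (5.19) \\
&= E_{\epsilon_2, \epsilon_3}\left[\frac{\partial f_4}{\partial y_1} + \frac{\partial f_4}{\partial y_2} \frac{\partial f_2}{\partial y_1} + \frac{\partial f_4}{\partial y_3} \frac{\partial f_3}{\partial y_1} + \frac{\partial f_4}{\partial y_3} \frac{\partial f_3}{\partial y_2} \frac{\partial f_2}{\partial y_1}\right]. && (5.20)
\end{aligned}$$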
Note that the expression of $TE_{41}$ given immediately above is analogous to the expression given in Equation 4.35 which is the COC in classical linear models. The difference between the two expressions is merely the expectation on the right-hand side of Equation 5.20, which is required in nonlinear path models. Thus, to estimate the quantity $TE_{41}$, as well as the associated direct and indirect effects, for models with mutually independent errors, the expectations on the right-hand side of Equation 5.20 must be estimated. These expectations generally contain $\epsilon_2$ and $\epsilon_3$ in nonlinear functions and, therefore, may require MC estimation as shown in Section 5.2.

Note that without the assumption of independent errors required for the quantity $\partial g_{\epsilon_2, \epsilon_3 \mid Y_1}(\epsilon_2, \epsilon_3 \mid y_1) / \partial y_1$ to be zero, inclusion of the second term in Equation 5.13 is required and this analog for the COC does not hold.
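
To make the four-variable partition concrete, the following sketch estimates each expected effect by Monte Carlo under independent normal errors. The mean functions `f2`, `f3` and `f4`, the error standard deviations and the evaluation point are all hypothetical choices made for illustration; they are not the models analyzed in this dissertation.

```python
import math
import random


# Hypothetical smooth mean functions (illustrative only).
def f2(y1):
    return math.sin(y1)


def f3(y1, y2):
    return 0.5 * y1 * y2 + 0.2 * y2 ** 2


def f4(y1, y2, y3):
    return y1 * y2 + math.exp(0.3 * y3)


def partials(y1, e2, e3):
    """Direct effect and the three indirect effects for one draw of (e2, e3)."""
    y2 = f2(y1) + e2
    y3 = f3(y1, y2) + e3
    df2_1 = math.cos(y1)
    df3_1, df3_2 = 0.5 * y2, 0.5 * y1 + 0.4 * y2
    df4_1, df4_2, df4_3 = y2, y1, 0.3 * math.exp(0.3 * y3)
    return {
        "DE41": df4_1,
        "IE421": df4_2 * df2_1,
        "IE431": df4_3 * df3_1,
        "IE4321": df4_3 * df3_2 * df2_1,
    }


def expected_effects(y1, draws):
    """Average each effect over the supplied error draws."""
    totals = {"DE41": 0.0, "IE421": 0.0, "IE431": 0.0, "IE4321": 0.0}
    for e2, e3 in draws:
        for name, value in partials(y1, e2, e3).items():
            totals[name] += value
    return {name: s / len(draws) for name, s in totals.items()}


def mean_y4(y1, draws):
    """Monte Carlo approximation to E(Y4 | Y1 = y1) using the same draws."""
    total = 0.0
    for e2, e3 in draws:
        y2 = f2(y1) + e2
        total += f4(y1, y2, f3(y1, y2) + e3)
    return total / len(draws)


rng = random.Random(2024)
draws = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(20_000)]
effects = expected_effects(1.0, draws)
te_41 = sum(effects.values())   # expected DE plus the three expected IEs
```

Reusing the same `draws` in `mean_y4` at two nearby values of `y1` gives a finite-difference check on `te_41`, since differentiating the sample average reproduces the averaged derivatives. This shortcut relies on the errors being independent of Y1, which is exactly the assumption under which the second term of Equation 5.13 vanishes.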

5.3.2    Total  Effect  of  Y2  on  Y4 

To preview the concept of a general conditional total effect, we present a brief example of such an effect under the assumption of independent $\epsilon_k$. We write the conditional total effect of $Y_2$ on $Y_4$, given $Y_1 = y_1$, denoted $TE_{42|1}$, as

$$TE_{42|1} = \frac{\partial}{\partial y_2} E_{Y_4 \mid Y_1, Y_2}(Y_4 \mid y_1, y_2).$$

(This concept of a conditional TE will be discussed further in the following section.) Note that this conditional TE can also be written as

$$\begin{aligned}
TE_{42|1} &= \frac{\partial}{\partial y_2} E_{Y_3 \mid Y_1, Y_2}[E_{Y_4 \mid Y_1, Y_2, Y_3}(Y_4 \mid y_1, y_2, Y_3)] && (5.21) \\
&= \frac{\partial}{\partial y_2} E_{Y_3 \mid Y_1, Y_2}[f_4(y_1, y_2, Y_3)]. && (5.22)
\end{aligned}$$

Now, by writing Equation 5.22 as an expectation over $\epsilon_3$ given $Y_1$ and $Y_2$ and interchanging the derivative and integral, which is allowed by the assumption of mutual independence of the error terms, we have

$$TE_{42|1} = E_{\epsilon_3 \mid Y_1, Y_2}\left[\frac{\partial}{\partial y_2} f_4(y_1, y_2, f_3(y_1, y_2) + \epsilon_3)\right].$$

Note that $f_4$ is a compound function of $y_2$ through $f_3$. Hence, we can write

$$TE_{42|1} = E_{Y_3 \mid Y_1, Y_2}\left[\frac{\partial f_4}{\partial y_2} + \frac{\partial f_4}{\partial y_3} \frac{\partial f_3}{\partial y_2} \,\Big|\, Y_1 = y_1, Y_2 = y_2\right],$$

which is an expectation of the sum of $DE_{42}$ and $IE_{432}$. Note that these are functions of the given value, $y_1$, of the antecedent endogenous variable $Y_1$. Hence, the term conditional total effect.

5.4     General  Definitions  of  Effects 
We now present general definitions of total, direct and indirect effects that are applicable generally to both the classical linear path models as well as to the nonlinear models with continuous variables which are studied in this chapter. We examine the total, direct and indirect effects of variable $Y_k$ on variable $Y_l$, for any $k$ and $l$ such that $1 \le k < l \le p$. For definitions and derivations of effects, we utilize the following notation:

$A_k$ collectively denotes the subscripts (or indices) of all variables antecedent to $Y_k$ in the causal chain,

$Y_{A_k}$ collectively represents all variables antecedent to $Y_k$ in the causal chain,

$Y_I$ collectively represents all intermediate variables between $Y_k$ and $Y_l$, and

$Y_S$ collectively represents all variables subsequent to $Y_l$ in the causal chain.

More explicitly, $A_k$ represents the indices $1, 2, \ldots, k - 1$ and $Y_{A_k}$ represents the variables $Y_1, Y_2, \ldots, Y_{k-1}$. Likewise, $Y_I$ represents variables $Y_{k+1}, Y_{k+2}, \ldots, Y_{l-1}$ and $Y_S$ represents the variables $Y_{l+1}, Y_{l+2}, \ldots, Y_p$. Note that, when considering the collective group of variables denoted by $Y_{A_l}$, we may, for clarity at times, also write the group as $Y_{A_k}, Y_k, Y_I$. Also, we let $y_{A_k}$, $y_k$, $y_I$ and $y_S$ denote given values of the $Y_{A_k}$, $Y_k$, $Y_I$ and $Y_S$ variables, respectively.

Let $TE_{lk}$ denote the total effect of variable $Y_k$ on variable $Y_l$, for any $k$ and $l$ such that $1 \le k < l \le p$. Let $TE_{lk|A_k}$ denote the conditional total effect of variable $Y_k$ on variable $Y_l$, conditional on variables $Y_{A_k}$. Also, let $DE_{lk}$ denote the conditional direct effect of $Y_k$ on $Y_l$ given $Y_{A_k}$ and $Y_k$.

Definition 5.4.1. The conditional total effect of $Y_k$ on $Y_l$, $1 \le k < l \le p$, given $Y_{A_k} = y_{A_k}$ and $Y_k = y_k$, in a causally ordered chain of $p$ variables, is denoted by $TE_{lk|A_k}$ and is defined by

$$TE_{lk|A_k} = \frac{\partial}{\partial y_k} E_{Y_l \mid Y_{A_k}, Y_k}(Y_l \mid Y_{A_k} = y_{A_k}, Y_k = y_k).$$

In the above definition of conditional total effect when $k = 1$, there are no variables in the group $Y_{A_k}$, and, hence, there are no variables on which to condition. Thus, in this special case, the conditional total effect is equivalent to the unconditional total effect.

Note that, using iterated expectations, Definition 5.4.1 can also be written as

$$\begin{aligned}
TE_{lk|A_k} &= \frac{\partial}{\partial y_k} E_{Y_I \mid Y_{A_k}, Y_k}[E_{Y_l \mid Y_{A_k}, Y_k, Y_I}(Y_l \mid Y_{A_k} = y_{A_k}, Y_k = y_k, Y_I)] \\
&= \frac{\partial}{\partial y_k} E_{Y_I \mid Y_{A_k}, Y_k}[f_l(y_{A_k}, y_k, Y_I)].
\end{aligned} \tag{5.23}$$

Also, again using iterated expectations, we can write this conditional total effect as

$$\begin{aligned}
TE_{lk|A_k} &= \frac{\partial}{\partial y_k} E_{Y_S \mid Y_{A_k}, Y_k}[E_{Y_l \mid Y_{A_k}, Y_k, Y_S}(Y_l \mid y_{A_k}, y_k, Y_S)] \\
&= \frac{\partial}{\partial y_k} E_{Y_S \mid Y_{A_k}, Y_k}[E_{Y_l \mid Y_{A_k}, Y_k}(Y_l \mid y_{A_k}, y_k)]
\end{aligned}$$

because of the assumed strict ordering of $Y_1, Y_2, \ldots, Y_p$. Thus, the conditional total effect, $TE_{lk|A_k}$, can be viewed as the average TE of $Y_k$ on $Y_l$, given $Y_{A_k}$ and $Y_S$, averaged over $Y_S$ given $Y_{A_k}$ and $Y_k$. In this case, where the conditional expectation of $Y_l$ does not depend on $Y_S$, averaging over $Y_S$ is equivalent to ignoring it.

Definition 5.4.2. The conditional direct effect of $Y_k$ on $Y_l$, $1 \le k < l \le p$, in a causally ordered chain of $p$ variables, given $Y_{A_k} = y_{A_k}$ and $Y_k = y_k$, is denoted by $DE_{lk}$ and is defined by

$$DE_{lk} = \frac{\partial f_l(y_{A_l})}{\partial y_k}.$$

Before giving the general definition of an indirect effect (IE), we introduce notation that is utilized throughout the remainder of this chapter. First, let $B_{lk}$ denote the set of indices (or subscripts) of all intermediate variables between $Y_k$ and $Y_l$. That is,

$$B_{lk} = \{k + 1, k + 2, \ldots, l - 1\}.$$

The power set of $B_{lk}$, denoted by $2^{B_{lk}}$ (Billingsley [6]), is the set of all possible subsets of the set $B_{lk}$. Note that the empty set is an element of $2^{B_{lk}}$. Let $Q$ be any arbitrary element of $2^{B_{lk}}$. That is, let $Q$ consist of the set of indices associated with an arbitrary subset of variables in $Y_I$. Let the pair $(k', l')$ denote a pair of adjacent indices in the set $\{l\} \cup Q \cup \{k\}$, which is henceforth denoted by $l(Q)k$. More specifically, these pairs of adjacent indices represent the subscripts of adjacent intermediate variables involved in an IE representing the path between $Y_k$ and $Y_l$ through the endogenous variables associated with the indices in $Q$. As an example, if $Q = \{k + 1, k + 2\}$, then $(k', l')$ represents $(k, k + 1)$, $(k + 1, k + 2)$ or $(k + 2, l)$, implying that the IE under examination is associated with the path $Y_k \to Y_{k+1} \to Y_{k+2} \to Y_l$. Note that, if $Q = \emptyset$, that is, $Q$ is the empty set, then the set of indices $l(Q)k$ is simply $\{k, l\}$, and $l(Q)k$ is associated with the DE of $Y_k$ on $Y_l$, as opposed to an IE. Utilizing the above notation, we define a general conditional indirect effect.

Definition 5.4.3. The conditional indirect effect of $Y_k$ on $Y_l$ through an arbitrarily selected set of intermediate variable(s), with associated set of indices $Q$, given $Y_{A_k} = y_{A_k}$ and $Y_k = y_k$, is denoted $IE_{l(Q)k}$ and is defined by

$$IE_{l(Q)k} = \prod_{(k', l') \in l(Q)k} DE_{l'k'} = \prod_{(k', l') \in l(Q)k} \frac{\partial f_{l'}(y_{A_{l'}})}{\partial y_{k'}},$$

where $(k', l')$ denotes pairs of adjacent subscripts in the set $l(Q)k$.

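For example, consider the case of $k = 1$ and $l = 5$. Also, suppose that we are interested in the IE designated by $IE_{5421}$. Here, $Q = \{2, 4\}$ and $(k', l')$ represents $(1, 2)$, $(2, 4)$ or $(4, 5)$. Hence,

$$IE_{5421} = DE_{54}\, DE_{42}\, DE_{21} = \frac{\partial f_5(y_{A_5})}{\partial y_4} \cdot \frac{\partial f_4(y_{A_4})}{\partial y_2} \cdot \frac{\partial f_2(y_{A_2})}{\partial y_1}.$$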
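
The index bookkeeping in Definition 5.4.3 is mechanical and can be sketched in code. The helpers below are illustrative names introduced here, not notation from the text: `adjacent_pairs` returns the pairs (k', l') for a given Q, and `index_paths` walks the whole power set of the intermediate indices.

```python
from itertools import combinations


def adjacent_pairs(k, l, q):
    """Adjacent index pairs (k', l') in the ordered set {k} U Q U {l}."""
    chain = (k, *sorted(q), l)
    return list(zip(chain, chain[1:]))


def index_paths(k, l):
    """Yield (Q, adjacent pairs) for every Q in the power set of B_lk."""
    b_lk = range(k + 1, l)   # B_lk = {k+1, ..., l-1}
    for r in range(len(b_lk) + 1):
        for q in combinations(b_lk, r):
            yield set(q), adjacent_pairs(k, l, q)


# The example from the text: k = 1, l = 5, Q = {2, 4}.
pairs_5421 = adjacent_pairs(1, 5, {2, 4})

# All 2**3 = 8 index sets linking Y1 to Y5; the empty Q is the direct effect.
all_paths = list(index_paths(1, 5))
```

For the worked example, `pairs_5421` lists the three pairs (1, 2), (2, 4) and (4, 5), so the corresponding IE is the product of three direct effects, one per pair.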

These definitions will be used in the next section to derive an analog to the COC when the $\epsilon_k$ are independent. Henceforth, the nomenclature "conditional direct effect" and "conditional indirect effect" will be dropped in favor of simply "direct effect" and "indirect effect", leaving implicit the fact that $DE_{lk}$ and $IE_{l(Q)k}$ are conditional on $Y_{A_k}$.

5.5    The  Calculus  of  Effects  for  Models  with  p  Variables  and  Classical  Assumptions

We now derive a general analog to the COC by extending results such as those in Chapter 4, Equation 4.35, and those presented in Section 5.3, Equation 5.20, to nonlinear models with $p$ endogenous variables and independent errors. The resulting analog is called the Calculus of Effects (COE).

5.5.1     The  Model  and  Assumptions 
Consider  the  model 


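$$\begin{aligned}
Y_1 &= \beta_{10} + \epsilon_1 && (5.24) \\
Y_2 &= f_2(Y_1, \underline{\beta}_2) + \epsilon_2 && (5.25) \\
&\;\;\vdots \\
Y_p &= f_p(Y_1, Y_2, \ldots, Y_{p-1}, \underline{\beta}_p) + \epsilon_p && (5.26)
\end{aligned}$$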

where  the  random  variables  Y\,  Y2,  . . . ,  Yp  are  continuous.  Furthermore,  assume: 

1.  The  error  terms,  e*,  k  =  1,  2,  . . . ,  p,  have  zero  expectation,  conditional  on  all 
previous  variables  in  the  causal  chain. 

2.  The  error  terms  are  mutually  independent  and  e^,  i  =  1,  2,  . . . ,  n,  are 
independent  and  identically  distributed  with  conditional  variance  a\. 

3.  The  Y^  variables  have  conditional  mean  functions,  /fc,  that  have  continuous 
partial  derivatives. 

4.  The  density  functions  of  the  Yk  are  differentiable  everywhere  with  continuous 
derivatives. 

Note that this model specification amounts, in part, to the assumption that each random variable, $Y_k$, $k = 1, 2, \ldots, p$, has a structural relationship exclusively with the variables occurring before it in the causal chain. Henceforth, we refer to this specific type of structural relationship as strictly ordered. In nonlinear path models, this structural ordering of the variables within the causal chain is analogous to the triangular B matrix in linear, recursive path models, as presented in Section 4.1.1. Note also that, if the $\epsilon_k$ terms are assumed to be normally distributed, then Assumption 1 above, along with the strict ordering assumption, implies Assumption 2.

5.5.2     Calculus  of  Effects  Partitioning  of  Total  Effects 

We  now  show  that  conditional  total  effects  can  be  partitioned  into  expected 
direct  and  indirect  effects  in  a  way  analogous  to  the  COC  in  linear  models.  The 
partitioning  is  called  the  Calculus  of  Effects  (COE). 

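Theorem 5.5.1. (Calculus of Effects) Given the p-variable path model defined by Equations 5.24 - 5.26, and assumptions 1 - 4, the conditional total effect of $Y_k$ on $Y_l$ given $Y_{A_k} = y_{A_k}$ and $Y_k = y_k$, that is, $TE_{lk|A_k}$ in Definition 5.4.1, for $l$ and $k$ arbitrarily chosen such that $1 \le k < l \le p$, is the sum of the expectations of direct and indirect effects with respect to the density of $\underline{\epsilon}_I = (\epsilon_{k+1}, \epsilon_{k+2}, \ldots, \epsilon_{l-1})$. That is,

\begin{align}
TE_{lk|A_k} &= E[DE_{lk}] + \sum_{Q \in 2^{B_{lk}} - \emptyset} E[IE_{l(Q)k}] \tag{5.27}\\
&= E\left[\frac{\partial f_l}{\partial y_k}\right] + \sum_{Q \in 2^{B_{lk}} - \emptyset} E\left[\prod_{(k',l') \in l(Q)k} \frac{\partial f_{l'}}{\partial y_{k'}}\right], \tag{5.28}
\end{align}

where $(k', l')$ denotes pairs of adjacent subscripts in the set $l(Q)k$.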

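Proof. Following the form of Definition 5.4.1 above, we write the conditional total effect, $TE_{lk|A_k}$, as

\begin{align}
TE_{lk|A_k} &= \frac{\partial}{\partial y_k} E_{Y_l|Y_{A_k},Y_k}(Y_l \mid y_{A_k}, y_k) \tag{5.29}\\
&= \frac{\partial}{\partial y_k} E_{Y_I|Y_{A_k},Y_k}[f_l(y_{A_k}, y_k, Y_I)] \tag{5.30}
\end{align}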


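by the laws of conditional expectations and the facts that $f_l(y_{A_k}, y_k, Y_I) = E_{Y_l|Y_{A_k},Y_k,Y_I}(Y_l \mid y_{A_k}, y_k, Y_I)$ and, by assumption, $E(\epsilon_l \mid y_{A_k}, y_k, Y_I) = 0$. Now, by writing $f_l$ as a compound function of $y_{A_k}$, $y_k$ and the elements of $\underline{\epsilon}_I$, we have

\begin{align}
TE_{lk|A_k} = \frac{\partial}{\partial y_k} E_{\underline{\epsilon}_I|Y_{A_k},Y_k}\{ f_l[ & y_{A_k}, y_k, f_{k+1}(y_{A_k}, y_k) + \epsilon_{k+1}, \tag{5.31}\\
& f_{k+2}(y_{A_k}, y_k, f_{k+1}(y_{A_k}, y_k) + \epsilon_{k+1}) + \epsilon_{k+2}, \ldots, \notag\\
& f_{l-1}(y_{A_k}, y_k, f_{k+1} + \epsilon_{k+1}, \ldots, \notag\\
& \quad f_{l-2}(y_{A_k}, y_k, f_{k+1} + \epsilon_{k+1}, \ldots, f^c_{l-3} + \epsilon_{l-3}) + \epsilon_{l-2}) + \epsilon_{l-1}]\}, \notag
\end{align}

where we utilize the shorthand notation of $f^c_{l-3}$ to denote the fact that $f_{l-3}$ is the compound function of $y_k$, $y_{A_k}$ and $\epsilon_{k+1}, \epsilon_{k+2}, \ldots, \epsilon_{l-4}$ obtained through recursive substitution of $y_k$ and $y_{A_k}$ into $Y_{k+1}, Y_{k+2}, \ldots, Y_{l-4}$, the arguments of $f_{l-3}$. More generally, let

\[
f^c_i = f_i[y_{A_k}, y_k, f_{k+1}(y_{A_k}, y_k) + \epsilon_{k+1}, f_{k+2}(y_{A_k}, y_k, f_{k+1} + \epsilon_{k+1}) + \epsilon_{k+2}, \ldots, f_{i-1}(y_{A_k}, y_k, f_{k+1} + \epsilon_{k+1}, \ldots, f^c_{i-2} + \epsilon_{i-2}) + \epsilon_{i-1}].
\]

Furthermore, let $\underline{f}^c_I$ denote the row vector $(f^c_{k+1}, f^c_{k+2}, \ldots, f^c_{l-1})$. Then,

\[
TE_{lk|A_k} = \frac{\partial}{\partial y_k} \int f_l[y_{A_k}, y_k, \underline{f}^c_I + \underline{\epsilon}_I]\, g(\underline{\epsilon}_I \mid y_{A_k}, y_k)\, d\underline{\epsilon}_I,
\]

where $g(\underline{\epsilon}_I \mid y_{A_k}, y_k)$ represents the joint probability density function for the variables included in $\underline{\epsilon}_I$, conditional on $Y_k = y_k$ and $Y_{A_k} = y_{A_k}$.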

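Next, following Leibniz's Rule (Section 3.3), we can interchange the derivative and integrals, apply the product rule for differentiation, and write

\begin{align}
TE_{lk|A_k} &= \int \frac{\partial}{\partial y_k}\left[f_l(y_{A_k}, y_k, \underline{f}^c_I + \underline{\epsilon}_I)\, g(\underline{\epsilon}_I \mid y_{A_k}, y_k)\right] d\underline{\epsilon}_I \tag{5.32}\\
&= \int \left[\frac{\partial}{\partial y_k} f_l(y_{A_k}, y_k, \underline{f}^c_I + \underline{\epsilon}_I)\right] g(\underline{\epsilon}_I \mid y_{A_k}, y_k)\, d\underline{\epsilon}_I \tag{5.33}\\
&\quad + \int f_l(y_{A_k}, y_k, \underline{f}^c_I + \underline{\epsilon}_I) \left[\frac{\partial}{\partial y_k} g(\underline{\epsilon}_I \mid y_{A_k}, y_k)\right] d\underline{\epsilon}_I. \notag
\end{align}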

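Given the classical assumptions for continuous path models, that is, mutual independence among the $\epsilon_k$, $k = 1, 2, \ldots, p$, the second term in Equation 5.33 above is zero. As was seen in Section 5.3, the derivative in this term is zero because $g(\underline{\epsilon}_I \mid y_{A_k}, y_k)$ is not a function of the given values of $Y_{A_k}$ or $Y_k$ when the $\underline{\epsilon}_I$ terms are independent of $Y_{A_k}$ and $\epsilon_k$. Also, $g(\underline{\epsilon}_I \mid y_{A_k}, y_k)$ is independent of $y_{A_k}$ and $y_k$ when the $\epsilon_k$ are independent. We denote the density of $\underline{\epsilon}_I$ by $g(\underline{\epsilon}_I)$ in this case. Therefore, when the $\epsilon_k$ are independent, we can write Equation 5.32 as

\[
TE_{lk|A_k} = E_{\underline{\epsilon}_I}\left[\frac{\partial}{\partial y_k} f_l(y_{A_k}, y_k, \underline{f}^c_I + \underline{\epsilon}_I)\right]. \tag{5.34}
\]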

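By applying the multivariable chain rule (MVCR) inside of the expectation, we write Equation 5.34 as

\begin{align}
TE_{lk|A_k} = E_{\underline{\epsilon}_I}\Big[ & \frac{\partial f_l(y_{A_l})}{\partial y_k} + \frac{\partial f_l}{\partial y_{k+1}}\frac{\partial f_{k+1}}{\partial y_k} + \frac{\partial f_l}{\partial y_{k+2}}\frac{\partial f^c_{k+2}}{\partial y_k} \tag{5.35}\\
& + \cdots + \frac{\partial f_l}{\partial y_{l-2}}\frac{\partial f^c_{l-2}}{\partial y_k} + \frac{\partial f_l}{\partial y_{l-1}}\frac{\partial f^c_{l-1}}{\partial y_k}\Big]. \notag
\end{align}

Note that each of the $\partial f^c_i / \partial y_k$ terms, $i = k+2, k+3, \ldots, l-1$, can be further expanded into sums of products of partial derivatives via recursive applications of the MVCR. As an example, we can write $\partial f^c_{k+3} / \partial y_k$ as

\begin{align*}
\frac{\partial f^c_{k+3}}{\partial y_k} &= \frac{\partial f_{k+3}}{\partial y_k} + \frac{\partial f_{k+3}}{\partial y_{k+1}}\frac{\partial f_{k+1}}{\partial y_k} + \frac{\partial f_{k+3}}{\partial y_{k+2}}\frac{\partial f^c_{k+2}}{\partial y_k}\\
&= \frac{\partial f_{k+3}}{\partial y_k} + \frac{\partial f_{k+3}}{\partial y_{k+1}}\frac{\partial f_{k+1}}{\partial y_k} + \frac{\partial f_{k+3}}{\partial y_{k+2}}\left[\frac{\partial f_{k+2}}{\partial y_k} + \frac{\partial f_{k+2}}{\partial y_{k+1}}\frac{\partial f_{k+1}}{\partial y_k}\right].
\end{align*}

Repeated application of the MVCR to the factors involving an $f^c$ in each term of Equation 5.35 until no compound functions remain in the expansion yields an expectation of the sum of the direct effect and all possible indirect effects, which themselves are products of direct effects, as given in Definition 5.4.3. Taking the expectation through the sum of the expansion gives an analog to the COC for the p variable case, conditional on the antecedent variables, $Y_{A_k}$, and on $Y_k$. □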
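As a numerical illustration of the partition in Theorem 5.5.1, the following sketch checks the three-variable case with k = 1 and l = 3, where the total effect should equal E[DE_31] + E[IE_321]. The mean functions `f2` and `f3`, the normal error distribution, and the evaluation point are all hypothetical choices made for this example; the total effect is approximated by a central finite difference of a Monte Carlo estimate of the conditional mean.

```python
import numpy as np

# Hypothetical conditional mean functions for a 3-variable recursive model.
def f2(y1):
    return np.sin(y1)

def f3(y1, y2):
    return y1 * y2 + 0.5 * y2 ** 2

# Their partial derivatives (the direct effects).
def df2_dy1(y1):
    return np.cos(y1)

def df3_dy1(y1, y2):
    return y2

def df3_dy2(y1, y2):
    return y1 + y2

rng = np.random.default_rng(0)
eps2 = rng.normal(0.0, 1.0, size=200_000)  # draws of the intermediate error

def cond_mean_y3(y1):
    """Monte Carlo estimate of E[Y3 | Y1 = y1], integrating over eps2."""
    return np.mean(f3(y1, f2(y1) + eps2))

y1, h = 0.7, 1e-4
# Total effect: derivative of the conditional mean (central difference,
# using the same error draws at both evaluation points).
te = (cond_mean_y3(y1 + h) - cond_mean_y3(y1 - h)) / (2 * h)

y2 = f2(y1) + eps2
expected_de = np.mean(df3_dy1(y1, y2))                # E[DE_31]
expected_ie = np.mean(df3_dy2(y1, y2) * df2_dy1(y1))  # E[IE_321]
print(te, expected_de + expected_ie)
```

Because the same error draws are used throughout, the two printed values should agree up to the finite-difference error, which is the sample analog of interchanging differentiation and integration in the proof.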

Note that there is a different value of $TE_{lk|A_k}$ for each setting of $y_k$, $y_{A_k}$. In practice, it may be appealing to average these over all possible values of $y_{A_k}$ in order to summarize the effects of $y_k$. Such average total effects can be written in COE form as the sum of average direct and indirect effects, where the average is taken over the distribution of $\underline{\epsilon}_I$, as in Theorem 5.5.1, and then over the distribution of $Y_{A_k}$ or some arbitrarily chosen "standard" distribution of $Y_{A_k}$. When the standard distribution weights each possible value of $Y_{A_k}$ equally, the average over $Y_{A_k}$ produces effects that are analogous to those based on least squares means in the analysis of covariance. When weighted according to any arbitrarily chosen distribution of $Y_{A_k}$, the average produces effects analogous to those based on standardized rate estimates in epidemiologic studies. It should be noted, however, that such averages of conditional total effects, $TE_{lk|A_k}$, are not equal to the unconditional total effect, $TE_{lk}$, in general.

Also, for the case of $k = 1$, we point out that there are no antecedent endogenous variables to include in the conditioning argument and Theorem 5.5.1 holds by eliminating all $Y_{A_k}$ from each expression in the statement of the theorem and proof. In this case, we have an unconditional TE. Hence, the DE and all IEs are not functions of $Y_{A_k}$.

It is sometimes useful to distinguish IEs based on the number of DEs in the product that defines the IE. For this purpose, we define the order of an indirect effect, $IE_{l(Q)k}$, to be the number of elements, $q$ say, in the set $Q$. That is, the order of an IE of $Y_k$ on $Y_l$ is the number of intermediate variables between $Y_k$ and $Y_l$ that are involved in $IE_{l(Q)k}$. Note that a $q$th order IE, $q = 0, 1, 2, \ldots, m = l - (k + 1)$, by Definition 5.4.3 is the product of $(q + 1)$ DEs.

Combinatorial mathematics yields some interesting facts to note regarding the sum of expected effects in the COE partitioning of $TE_{lk|A_k}$ in Theorem 5.5.1. Let the notation ${}_mC_r$ denote the quantity ${}_mC_r = \frac{m!}{r!(m-r)!}$ (McClave et al. [46]). Because $m$ denotes the number of variables between $Y_k$ and $Y_l$, there will be ${}_mC_0 = 1$ "zero-th" order IE terms (i.e., the direct effect term). Furthermore, there will be ${}_mC_1 = m$ first order IE terms and ${}_mC_2$ second order IE terms. This continues until there is only ${}_mC_m = 1$ $m$th order IE. The total number of terms in the COE partitioning, therefore, is $\sum_{r=0}^{m} {}_mC_r = 2^m$, the number of elements in the power set $2^{B_{lk}}$.
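The counting argument can be checked by brute-force enumeration. The sketch below lists every path index set for k = 1 and l = 5 and compares the number of terms of each order with the binomial coefficient; the function name `coe_terms` is a hypothetical helper introduced only for this illustration.

```python
from itertools import combinations
from math import comb


def coe_terms(k, l):
    """List (order, path) for every Q in the power set of B_lk = {k+1, ..., l-1}."""
    between = range(k + 1, l)
    terms = []
    for q in range(len(between) + 1):      # q is the order of the effect
        for Q in combinations(between, q):
            terms.append((q, (k,) + Q + (l,)))
    return terms


terms = coe_terms(1, 5)
m = 5 - (1 + 1)                            # number of intermediate variables
for q in range(m + 1):
    n_q = sum(1 for order, _ in terms if order == q)
    print(q, n_q, comb(m, q))              # counts agree with mCq
print(len(terms), 2 ** m)                  # total number of terms equals 2^m
```

For m = 3 this gives one direct effect, three first order, three second order and one third order indirect effect, eight terms in total.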

5.5.3     Estimation  of  Model  Parameters,  Direct  and  Indirect  Effects 

The  parameters  in  any  single  equation  of  a  nonlinear  path  model,  conditional 
on  the  endogenous  variables  on  the  right-hand  side  of  that  equation,  can  be 
estimated  using  the  method  of  Nonlinear  Least  Squares  (NLS).  That  is,  the 
nonlinear  conditional  mean  function  can  be  fit  using  nonlinear  regression  techniques 
and  the  associated  properties  of  these  NLS  estimators  hold  in  application  to  path 
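models. From methodology and results given by Gallant [17, p. 16-17], we now present estimation methods of the model parameters, using the notation in the path models in this dissertation.

Based on an initial estimator of $\underline{\beta}_k$, say $\hat{\underline{\beta}}_k^{(0)}$, an updated estimate of $\underline{\beta}_k$ is obtained by calculating

\[
\hat{\underline{\beta}}_k^{(1)} = \hat{\underline{\beta}}_k^{(0)} + (F_k^{(0)\prime} F_k^{(0)})^{-1} F_k^{(0)\prime} \underline{e}_k^{(0)},
\]

where $F_k^{(0)}$ is the $n \times p_k$ matrix of partial derivatives with $i$th row defined by

\[
F_{ki}^{(0)} = \frac{\partial f_k(y_{A_k i}; \underline{\beta}_k)}{\partial \underline{\beta}_k'}
\]

evaluated at $\underline{\beta}_k = \hat{\underline{\beta}}_k^{(0)}$ and the $i$th element of the $n \times 1$ vector $\underline{e}_k^{(0)}$ is defined by

\[
e_{ki}^{(0)} = y_{ki} - f_k(y_{A_k i}; \hat{\underline{\beta}}_k^{(0)}).
\]

This estimation process is repeated by substituting $\hat{\underline{\beta}}_k^{(m-1)}$ for $\hat{\underline{\beta}}_k^{(0)}$, $F_k^{(m-1)}$ for $F_k^{(0)}$ and $\underline{e}_k^{(m-1)}$ for $\underline{e}_k^{(0)}$ at the $m$th iteration and iterating until convergence. Denote the final estimator by $\hat{\underline{\beta}}_k$.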
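The iteration described above is the Gauss-Newton method. A minimal sketch in Python with NumPy follows; the function `gauss_newton`, the exponential mean function, the simulated data and the starting values are all assumptions made for illustration, and the sketch has no step-size control, so it presumes a reasonable initial estimator.

```python
import numpy as np


def gauss_newton(f, jac, y, x, beta0, tol=1e-10, max_iter=100):
    """Gauss-Newton iterations for nonlinear least squares.

    f(x, beta)   -> length-n vector of fitted conditional means
    jac(x, beta) -> n x p matrix F of partial derivatives df/dbeta'
    """
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        F = jac(x, beta)
        e = y - f(x, beta)                    # current residual vector
        # Solve (F'F) step = F'e rather than forming the inverse explicitly.
        step = np.linalg.solve(F.T @ F, F.T @ e)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Hypothetical mean function: f(x; beta) = beta_0 * exp(beta_1 * x).
def mean_fn(x, beta):
    return beta[0] * np.exp(beta[1] * x)


def mean_jac(x, beta):
    return np.column_stack([np.exp(beta[1] * x),
                            beta[0] * x * np.exp(beta[1] * x)])


rng = np.random.default_rng(1)
x = np.linspace(0.0, 2.0, 200)
y = mean_fn(x, np.array([2.0, 0.5])) + rng.normal(0.0, 0.05, size=x.size)
beta_hat = gauss_newton(mean_fn, mean_jac, y, x, beta0=[1.5, 0.6])
print(beta_hat)
```

In a path model, the same routine would be applied to each equation separately, with x holding the antecedent endogenous variables of that equation.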

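For random variables $Y_k$, $k = 1, 2, \ldots, p$, related through the structural equations given in Equations 5.24-5.26 and satisfying assumptions 1 through 4 in Section 5.5.1, Gallant [17, p. 16-17] showed that, under mild regularity conditions:

1. $\hat{\underline{\beta}}_k$ converges almost surely to $\underline{\beta}_k$,

2. $s_k^2 = (\sum_{i=1}^{n} \hat{e}_{ki}^2)/(n - p_k)$ converges almost surely to $\sigma_k^2$, and

3. $\hat{\underline{\beta}}_k$ follows an approximate multivariate normal distribution with mean $\underline{\beta}_k$ and variance $\frac{\sigma_k^2}{n} T_k^{-1}$ for large $n$,

where $T_k = \lim_{n \to \infty} \frac{1}{n} F_k' F_k$, the matrix $F_k$ is given by

\[
F_k = \frac{\partial f_k(y_{A_k}; \underline{\beta}_k)}{\partial \underline{\beta}_k'},
\]

and where the variance-covariance matrix is estimated by $s_k^2 [F_k'(\hat{\underline{\beta}}_k) F_k(\hat{\underline{\beta}}_k)]^{-1}$.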
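The variance estimate in the last line can be computed directly from the derivative matrix and the residuals. The sketch below is an illustration under simplifying assumptions: the helper name `nls_covariance` is hypothetical, and the toy data use a linear mean function so that F is simply the design matrix.

```python
import numpy as np


def nls_covariance(F, resid):
    """Return s^2 and the estimated covariance matrix s^2 (F'F)^{-1}."""
    n, p = F.shape
    s2 = resid @ resid / (n - p)           # residual mean square
    cov = s2 * np.linalg.inv(F.T @ F)
    return s2, cov


# Toy check with a linear mean function, where F is the design matrix.
rng = np.random.default_rng(2)
F = np.column_stack([np.ones(500), rng.uniform(0, 1, 500)])
beta = np.array([1.0, 2.0])
y = F @ beta + rng.normal(0.0, 0.3, size=500)
beta_hat = np.linalg.solve(F.T @ F, F.T @ y)
s2, cov = nls_covariance(F, y - F @ beta_hat)
print(s2, np.sqrt(np.diag(cov)))           # error variance and standard errors
```

For a nonlinear mean function, F would be the matrix of partial derivatives evaluated at the final estimate.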


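Asymptotic properties of $\hat{\underline{\beta}}_k$ are the same as those of $\hat{\underline{\beta}}_k^{(1)}$ (Fuller [16]). Note that

\[
\hat{\underline{\beta}}_k^{(1)} = \hat{\underline{\beta}}_k^{(0)} + (F_k^{(0)\prime} F_k^{(0)})^{-1} F_k^{(0)\prime} \underline{e}_k^{(0)}.
\]

Now, from the Taylor's series expansion of $f_k$ about $\hat{\underline{\beta}}_k^{(0)}$ in the $k$th equation, we have that

\begin{align*}
\underline{e}_k^{(0)} &= [\underline{y}_k - f_k(y_{A_k}; \hat{\underline{\beta}}_k^{(0)})]\\
&= F_k^{(0)}(\underline{\beta}_k - \hat{\underline{\beta}}_k^{(0)}) + O_p(1/n) + \underline{\epsilon}_k\\
&= F_k^{(0)} \Delta\underline{\beta}_k + \underline{\epsilon}_k + O_p(1/n)
\end{align*}

and

\begin{align*}
\Delta\hat{\underline{\beta}}_k &= (F_k^{(0)\prime} F_k^{(0)})^{-1} F_k^{(0)\prime} \underline{e}_k^{(0)}\\
&= (\underline{\beta}_k - \hat{\underline{\beta}}_k^{(0)}) + (F_k^{(0)\prime} F_k^{(0)})^{-1} F_k^{(0)\prime} \underline{\epsilon}_k + O_p(1/n).
\end{align*}

Assume that the initial estimators $\hat{\underline{\beta}}_k^{(0)}$ and $\hat{\underline{\beta}}_{k'}^{(0)}$, $k \ne k'$, are independent. Thus, $F_k^{(0)}$ and $F_{k'}^{(0)}$, $k \ne k'$, are independent and it follows from the independence of $\underline{\epsilon}_k$ and $\underline{\epsilon}_{k'}$ that $\Delta\hat{\underline{\beta}}_k$ and $\Delta\hat{\underline{\beta}}_{k'}$ are independent. It follows directly that $\hat{\underline{\beta}}_k^{(1)}$ and $\hat{\underline{\beta}}_{k'}^{(1)}$ are independent and, hence, $\hat{\underline{\beta}}_k$ and $\hat{\underline{\beta}}_{k'}$ are independent because they have the same asymptotic properties as $\hat{\underline{\beta}}_k^{(1)}$ and $\hat{\underline{\beta}}_{k'}^{(1)}$.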

Thus, from the results of asymptotic normality of each individual $\hat{\underline{\beta}}_k$, $k = 1, 2, \ldots, p$, together with the independence of each $\hat{\underline{\beta}}_k$ and $\hat{\underline{\beta}}_{k'}$ for all $k \ne k'$, it is true that the Nonlinear Least Squares estimator $\hat{\underline{\beta}} = (\hat{\underline{\beta}}_1', \hat{\underline{\beta}}_2', \ldots, \hat{\underline{\beta}}_p')'$ of $\underline{\beta} = (\underline{\beta}_1', \underline{\beta}_2', \ldots, \underline{\beta}_p')'$, which is formed by the concatenation of each individual $\hat{\underline{\beta}}_k$ vector, $k = 1, 2, \ldots, p$:

1. follows an approximate normal distribution, for large $n$, with mean $\underline{\beta}$; and

2. based on the results of independence shown immediately above, has block diagonal variance-covariance matrix with the $k$th block formed by $\frac{\sigma_k^2}{n} T_k^{-1}$.

5.5.4    The  Model  with  Linear  Predictors 


The nonlinear path model with linear predictors is defined by

\begin{align}
Y_1 &= \eta_1 + \epsilon_1 \tag{5.36}\\
Y_2 &= f_2(\eta_2) + \epsilon_2 \tag{5.37}\\
&\;\;\vdots \notag\\
Y_p &= f_p(\eta_p) + \epsilon_p, \tag{5.38}
\end{align}

where $\eta_k$, $k = 1, 2, \ldots, p$, is the linear combination of antecedent endogenous variables defined by

\[
\eta_k = \beta_{k0} + \sum_{j=1}^{k-1} \beta_{kj} Y_j
\]

and is called the linear predictor. The model assumptions are Assumptions 1-4 in Section 5.5.1, together with the additional assumption:

5. $f_k$, $k = 1, 2, \ldots, p$, are monotone in $\eta_k$, that is, $\partial f_k / \partial \eta_k \ne 0$ for all values of $\eta_k$.

We briefly consider the case with four endogenous variables, as shown previously in Equations 5.6-5.9 in Section 5.3, again omitting exogenous variables for simplification. Under the independence assumption on model errors, as in Equation 5.14, the quantity $TE_{41}$ can be written as

\[
TE_{41} = E_{\epsilon_2,\epsilon_3}\left[\frac{\partial f^c_4(\eta_4)}{\partial y_1}\right] \tag{5.39}
\]

where, following the notation in McCullagh and Nelder [47], $\eta_4$ represents the linear predictor $Y_{A_4}'\underline{\beta}_4$. That is,

\[
\eta_4 = \beta_{40} + \beta_{41} Y_1 + \beta_{42} Y_2 + \beta_{43} Y_3.
\]

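Note that the quantity $TE_{41}$ above can be partitioned into direct and indirect effects. More specifically, we can write

\[
TE_{41} = DE_{41} + IE_{421} + IE_{431} + IE_{4321} \tag{5.40}
\]

where, for example,

\[
DE_{41} = E_{\epsilon_2,\epsilon_3}\left[\frac{\partial f_4(\eta_4)}{\partial y_1}\right]
= E_{\epsilon_2,\epsilon_3}\left[\frac{\partial f_4(\eta_4)}{\partial \eta_4}\frac{\partial \eta_4}{\partial y_1}\right].
\]

We can also write $IE_{421}$, for example, as

\begin{align*}
IE_{421} &= E_{\epsilon_2,\epsilon_3}\left[\frac{\partial f_4(\eta_4)}{\partial y_2}\frac{\partial f_2(\eta_2)}{\partial y_1}\right]\\
&= E_{\epsilon_2,\epsilon_3}\left[\frac{\partial f_4(\eta_4)}{\partial \eta_4}\frac{\partial \eta_4}{\partial y_2}\frac{\partial f_2(\eta_2)}{\partial \eta_2}\frac{\partial \eta_2}{\partial y_1}\right]\\
&= \beta_{42}\beta_{21} E_{\epsilon_2,\epsilon_3}\left[\frac{\partial f_4(\eta_4)}{\partial \eta_4}\frac{\partial f_2(\eta_2)}{\partial \eta_2}\right].
\end{align*}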
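The four-variable partition can be checked numerically. In the sketch below, every mean function is taken to be tanh of its linear predictor, the coefficients in `b` are arbitrary, and the errors are normal; these are hypothetical choices for illustration only. The total effect of $Y_1$ on $Y_4$ is approximated by a finite difference and compared with the sum of the four coefficient-weighted expectations.

```python
import numpy as np


def f(eta):        # hypothetical mean function shared by all equations
    return np.tanh(eta)


def df(eta):       # its derivative with respect to the linear predictor
    return 1.0 - np.tanh(eta) ** 2


# Hypothetical coefficients: b[l][k] multiplies Y_k in eta_l; b[l][0] is the intercept.
b = {2: {0: 0.1, 1: 0.8},
     3: {0: -0.2, 1: 0.5, 2: 0.7},
     4: {0: 0.3, 1: 0.4, 2: -0.6, 3: 0.9}}

rng = np.random.default_rng(3)
e2 = rng.normal(0.0, 0.5, 100_000)   # draws of epsilon_2
e3 = rng.normal(0.0, 0.5, 100_000)   # draws of epsilon_3


def predictors(y1):
    """Linear predictors eta_2, eta_3, eta_4 given Y1 = y1 and the error draws."""
    eta2 = b[2][0] + b[2][1] * y1
    y2 = f(eta2) + e2
    eta3 = b[3][0] + b[3][1] * y1 + b[3][2] * y2
    y3 = f(eta3) + e3
    eta4 = b[4][0] + b[4][1] * y1 + b[4][2] * y2 + b[4][3] * y3
    return eta2, eta3, eta4


y1, h = 0.25, 1e-4
# Total effect: derivative of E[f4(eta4)] with respect to y1 (central difference).
te = (np.mean(f(predictors(y1 + h)[2])) - np.mean(f(predictors(y1 - h)[2]))) / (2 * h)

eta2, eta3, eta4 = predictors(y1)
d2, d3, d4 = df(eta2), df(eta3), df(eta4)
de_41 = b[4][1] * np.mean(d4)
ie_421 = b[4][2] * b[2][1] * np.mean(d4 * d2)
ie_431 = b[4][3] * b[3][1] * np.mean(d4 * d3)
ie_4321 = b[4][3] * b[3][2] * b[2][1] * np.mean(d4 * d3 * d2)
print(te, de_41 + ie_421 + ie_431 + ie_4321)
```

Each indirect effect is a product of path coefficients times an expectation of link-function derivatives, which is the structure exploited in the tests developed later in this section.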

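To generalize Equations 5.39 and 5.40 to the p variable case and to the case of an arbitrary $TE_{lk|A_k}$, we note that, under the model assumptions,

\[
TE_{lk|A_k} = E_{Y_I|Y_{A_k},Y_k}\left[\frac{\partial f^c_l(\eta_l)}{\partial y_k}\right]
\]

or, equivalently,

\[
TE_{lk|A_k} = E_{\underline{\epsilon}_I|Y_{A_k},Y_k}\left[\frac{\partial f^c_l(\eta_l)}{\partial y_k}\right].
\]

Since the $\epsilon_k$'s are independent of the previous $Y_k$'s, we write

\begin{align}
TE_{lk|A_k} &= E_{\underline{\epsilon}_I}\left[\frac{\partial f^c_l(\eta_l)}{\partial y_k}\right] \tag{5.41}\\
&= E_{\underline{\epsilon}_I}[DE_{lk}] + \sum_{Q \in 2^{B_{lk}} - \emptyset} E_{\underline{\epsilon}_I}[IE_{l(Q)k}] \tag{5.42}\\
&= E_{\underline{\epsilon}_I}\left[\frac{\partial f_l(\eta_l)}{\partial \eta_l}\frac{\partial \eta_l}{\partial y_k}\right] + \sum_{Q \in 2^{B_{lk}} - \emptyset} E_{\underline{\epsilon}_I}\left[\prod_{(k',l') \in l(Q)k} \frac{\partial f_{l'}(\eta_{l'})}{\partial \eta_{l'}}\frac{\partial \eta_{l'}}{\partial y_{k'}}\right] \tag{5.43}\\
&= \beta_{lk} E_{\underline{\epsilon}_I}\left[\frac{\partial f_l(\eta_l)}{\partial \eta_l}\right] + \sum_{Q \in 2^{B_{lk}} - \emptyset} \left\{ E_{\underline{\epsilon}_I}\left[\prod_{(k',l') \in l(Q)k} \frac{\partial f_{l'}(\eta_{l'})}{\partial \eta_{l'}}\right] \prod_{(k',l') \in l(Q)k} \beta_{l'k'} \right\} \tag{5.44}
\end{align}

for $1 \le k < l \le p$.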

Note that Equation 5.42 above is a direct result of Theorem 5.5.1. Also, note that the term $\partial \eta_l / \partial y_k$ yields the coefficient $\beta_{lk}$, which represents the coefficient of $Y_k$ in the linear predictor for $Y_l$. Therefore, to test the significance of expectations of direct or indirect effects, we can test the $\beta$ coefficients or products of the $\beta$ coefficients. We state this formally in the form of a theorem below. An applied example of hypothesis testing in a case such as this is shown in Section 8.1.6.

Given the path model with classical assumptions and nonlinear conditional mean functions of linear predictors, we show that tests of average direct and indirect effects can be achieved via tests of the associated coefficients and products of coefficients in the linear predictor. The specific result is given in the following theorem.

Theorem 5.5.2. For the nonlinear path model with linear predictors, as defined by Equations 5.36-5.38 and assumptions 1-5, testing the null hypothesis $H_0 : E_{\underline{\epsilon}_I}(IE_{l(Q)k}) = 0$, for a specified $Q \in 2^{B_{lk}}$, $B_{lk} = \{k + 1, k + 2, \ldots, l - 1\}$, is equivalent to testing the null hypothesis

\[
H_0 : \prod_{(k',l') \in l(Q)k} \beta_{l'k'} = 0,
\]

where $(k', l')$ denotes adjacent indices in the set $l(Q)k = \{l\} \cup Q \cup \{k\}$. Furthermore, an asymptotic test of this hypothesis can be based on the statistic

\[
Z = \frac{\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}}{\sqrt{\widehat{Var}\left(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}\right)}}
\]

where $Z \sim N(0, 1)$ given $H_0$ when $n$ is large, and

\[
\widehat{Var}\left(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}\right) = \prod_{(k',l') \in l(Q)k} \left[\widehat{Var}(\hat{\beta}_{l'k'}) + \hat{\beta}_{l'k'}^2\right] - \prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}^2,
\]

where $\widehat{Var}(\hat{\beta}_{l'k'})$ represents the $k'k'$th element of the variance-covariance matrix of $\hat{\underline{\beta}}_{l'}$.

Proof. From Theorem 5.5.1, we know that, given the model assumptions, the quantity $TE_{lk|A_k}$ can be written as the sum of an average direct and average indirect effects as in Equations 5.41-5.44 above. Thus, for an arbitrary average indirect effect, say $E_{\underline{\epsilon}_I}[IE_{l(Q)k}]$, we write

\[
E_{\underline{\epsilon}_I}[IE_{l(Q)k}] = E_{\underline{\epsilon}_I}\left[\prod_{(k',l') \in l(Q)k} \frac{\partial f_{l'}(\eta_{l'})}{\partial y_{k'}}\right].
\]

Due to the form of the linear predictor, we can write

\begin{align*}
E_{\underline{\epsilon}_I}[IE_{l(Q)k}] &= E_{\underline{\epsilon}_I}\left[\prod_{(k',l') \in l(Q)k} \frac{\partial f_{l'}(\eta_{l'})}{\partial \eta_{l'}} \prod_{(k',l') \in l(Q)k} \frac{\partial \eta_{l'}}{\partial y_{k'}}\right]\\
&= E_{\underline{\epsilon}_I}\left[\prod_{(k',l') \in l(Q)k} \frac{\partial f_{l'}(\eta_{l'})}{\partial \eta_{l'}}\right] \prod_{(k',l') \in l(Q)k} \beta_{l'k'}.
\end{align*}

Hence, we can write average direct and average indirect effects as products of the associated $\beta$ coefficients from the linear predictor terms of the variables along the path(s) of the effect of interest, multiplied by the expectation of the associated products of the derivatives of the $f_k$ functions, with respect to the linear predictors, $\eta_k$. For the standard distributions presented in McCullagh and Nelder [47], $E_{Y_I|Y_{A_k},Y_k}[\partial f_l / \partial \eta_l] \ne 0$ because the partial derivative of the $f_l$ function with respect to the linear predictor $\eta_l$ is either a strictly positive or strictly negative function and, therefore, has expectation that is nonzero. Thus, since the portion within the expectation is always nonzero, testing the significance of the particular direct or indirect effect of interest is equivalent to testing the significance of the quantity outside the expectation, that is, $\prod_{(k',l') \in l(Q)k} \beta_{l'k'}$.

Recall that $\hat{\underline{\beta}}_k$ and $\hat{\underline{\beta}}_{k'}$ are independent for all $k \ne k'$ under the independence assumption. Also, $\hat{\underline{\beta}}_k$ has an asymptotic normal distribution for all $k$. Using a Taylor's series expansion about $\underline{\beta}$, it is easily shown that the product of asymptotically normal random variables is also asymptotically normally distributed. That is, $\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}$ follows an approximate normal distribution with mean $\prod_{(k',l') \in l(Q)k} \beta_{l'k'}$. The variance, denoted by $Var(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'})$, is derived below.

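Using the fact that the variance of a random variable $X$ can be written as

\[
V(X) = E(X^2) - E^2(X), \tag{5.45}
\]

the quantity $\widehat{Var}(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'})$ in the denominator of the test statistic shown above can be written as

\[
Var\left(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}\right) = E\left[\left(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}\right)^2\right] - E^2\left[\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}\right]. \tag{5.46}
\]

Note that

\begin{align*}
E\left[\left(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}\right)^2\right] &= E\left(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}^2\right)\\
&= \prod_{(k',l') \in l(Q)k} E(\hat{\beta}_{l'k'}^2)
\end{align*}

(Mood et al. [50]), and

\[
E^2\left[\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}\right] = \prod_{(k',l') \in l(Q)k} E^2(\hat{\beta}_{l'k'})
\]

because $\hat{\beta}_{lk}$ is independent of $\hat{\beta}_{l'k'}$ when $k \ne k'$. We can write Equation 5.46 above as

\[
Var\left(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}\right) = \prod_{(k',l') \in l(Q)k} E(\hat{\beta}_{l'k'}^2) - \prod_{(k',l') \in l(Q)k} E^2(\hat{\beta}_{l'k'}). \tag{5.47}
\]

Now, again using the computational formula given in Equation 5.45 and expressing $E(X^2)$ as $V(X) + E^2(X)$, we rewrite Equation 5.47 above as

\[
Var\left(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}\right) = \prod_{(k',l') \in l(Q)k} \left[Var(\hat{\beta}_{l'k'}) + E^2(\hat{\beta}_{l'k'})\right] - \prod_{(k',l') \in l(Q)k} E^2(\hat{\beta}_{l'k'}). \tag{5.48}
\]

For large $n$, we can approximate this variance by

\[
\widehat{Var}\left(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}\right) = \prod_{(k',l') \in l(Q)k} \left[\widehat{Var}(\hat{\beta}_{l'k'}) + \hat{\beta}_{l'k'}^2\right] - \prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}^2, \tag{5.49}
\]

where $\widehat{Var}(\hat{\beta}_{l'k'})$ represents the $k'$th diagonal element of the variance-covariance matrix of $\hat{\underline{\beta}}_{l'}$ (that is, $\frac{\sigma_{l'}^2}{n} T_{l'}^{-1}$) and can be obtained from the single equation estimation of $\beta_{l'k'}$.

Therefore, under the null hypothesis and for large $n$, since the numerator $\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}$ is approximately normally distributed and the denominator $\widehat{Var}(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'})$ is a consistent estimator for $Var(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'})$, then by Slutsky's Theorem (Ghosh [19]), the quantity

\[
Z = \frac{\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}}{\sqrt{\widehat{Var}\left(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'}\right)}}
\]

is asymptotically distributed as a standard normal random variable with $\widehat{Var}(\prod_{(k',l') \in l(Q)k} \hat{\beta}_{l'k'})$ as in Equation 5.49. □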

Recall that using the $Q$ notation above implies that when $Q = \emptyset$, we are actually testing the null hypothesis $H_0 : DE_{lk} = 0$, which is equivalent to the test of $H_0 : \beta_{lk} = 0$. Hence, the above theorem allows for tests of both direct and indirect effects.
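A small sketch of the test statistic of Theorem 5.5.2 is given below, using only the Python standard library. The function name `product_test` and the numerical estimates are hypothetical; the inputs are the estimated path coefficients and their estimated variances, one pair per equation along the path, and the estimators are assumed independent across equations as in the theorem.

```python
import math


def product_test(betas, variances):
    """Z statistic for H0: the product of the path coefficients equals zero.

    betas     -> estimated coefficients along the path, one per equation
    variances -> their estimated variances, in the same order
    """
    prod = math.prod(betas)
    # Estimated variance of the product under independence (Equation 5.49).
    var = (math.prod(v + b * b for b, v in zip(betas, variances))
           - math.prod(b * b for b in betas))
    z = prod / math.sqrt(var)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, var, p_value


# Hypothetical estimates for the path 1 -> 2 -> 4 (beta_21 and beta_42).
z, var, p = product_test([0.8, -0.6], [0.01, 0.02])
print(z, var, p)
```

With a single coefficient the variance expression collapses to the variance of that coefficient, so the statistic reduces to the usual ratio of an estimate to its standard error, which is the direct effect case.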

5.6    Estimation and Testing in Path Models with Exponentially Distributed Endogenous Variables

We now derive Maximum Likelihood (ML) estimators of the parameters in the model in Equations 5.24-5.26 and their asymptotic distribution under the assumption that the endogenous variables are conditionally exponentially distributed.

5.6.1     The  Model  and  Assumptions 

Consider the model defined in Equations 5.24-5.26 with $Y_k$'s, $k = 1, 2, \ldots, p$, that are exponentially distributed, conditional on previous endogenous variables. Assume that:

1. The error terms, $\epsilon_k$, $k = 1, 2, \ldots, p$, have zero expectation, conditional on $Y_1, Y_2, \ldots, Y_{k-1}$.

2. The $Y_k$ variables are continuous with conditional mean functions, $f_k$, that are partially differentiable with respect to the parameters in the model and previous endogenous variables.

Note that, with this particular model, the $\epsilon_k$ terms cannot be independent, unless the variance function is functionally independent of the mean function, as is the case when the $Y_k$'s are normally distributed. The assumption of normally distributed errors along with that of zero conditional expectation means that the errors are mutually independent and, hence, is addressed in Section 5.5. In the current section, we consider the more general case where the errors are not necessarily independent.

5.6.2     Maximum  Likelihood  Estimation 


Before discussing maximum likelihood estimation of model parameters, we briefly present some useful terminology. Limited Information Maximum Likelihood (LIML) refers to the maximum likelihood (ML) estimation of parameters contained within a single equation, using only the information contained in observations of variables in that equation. That is, LIML estimates of the parameters of the $k$th equation are calculated from observations of variables $Y_1$ through $Y_k$, ignoring all other structural relationships. This method is distinguished from Full Information Maximum Likelihood (FIML) estimation, where all parameters within the system are estimated simultaneously. The FIML method uses information on the endogenous variables within the system and, in general, takes into account the error covariances across equations to estimate parameters. More explicitly, FIML estimates the parameters of the $k$th equation using all of the information available on all endogenous variables in the entire system and, also, using information regarding covariances among these endogenous variables. FIML estimation is a system generalization of LIML estimation [58].

We now present the derivation of LIML estimates of parameters in path models with conditional exponentially distributed response variables. The derivation amounts to a modest generalization of ML estimation of the parameters of the generalized linear model considered by Agresti [1] and McCullagh and Nelder [47]. Agresti [1] and McCullagh and Nelder [47] consider models where the mean function may be a nonlinear function, but of a linear predictor, $\eta$. We consider the generalization where the mean function can be any nonlinear function satisfying assumption (2) above. This generalization is needed in the current study because, here, we consider nonlinear conditional mean functions that are not necessarily functions of a linear predictor.

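Theorem 5.6.1. Let each $Y_k$, $k = 1, 2, \ldots, p$, have an exponential distribution conditional on $Y_{A_k}$. That is, suppose the conditional density function of $Y_k$ given $Y_{A_k} = y_{A_k}$ is

\[
g_{Y_k|Y_{A_k}}(y_k \mid \theta_k(y_{A_k}), \phi_k) = \exp\{[y_k\theta_k - b_k(\theta_k)]/a_k(\phi_k) + c_k(y_k; \phi_k)\}.
\]

Let $f_k(y_{A_k}; \underline{\beta}_k)$ denote the conditional mean function $E(Y_k \mid Y_{A_k} = y_{A_k})$. Assume that $f_k$ satisfies the mild regularity conditions required for nonlinear least squares to have the usual properties (Gallant [17, p. 156-185]), $\underline{\beta}_k$ is a vector of parameters of dimension $p_k$ and that $\phi_k$ is known. Then the limited information maximum likelihood estimator of $\underline{\beta}_k$, based on $n$ independent observations $(y_{ki}, y_{A_k i})$, $i = 1, 2, \ldots, n$, from $n$ individuals, is obtained through the following 4 step iteratively reweighted nonlinear least squares procedure:

1. Obtain an initial estimator of $\underline{\beta}_k$, say $\hat{\underline{\beta}}_k^{(0)}$;

2. Calculate the $n \times n$ weight matrix

\[
V_k^{(0)-1} = \mathrm{diag}\left\{\left[\frac{\partial f_k(y_{A_k i}; \hat{\underline{\beta}}_k^{(0)})}{\partial \theta_{ki}}\, a_k(\phi_k)\right]^{-1}\right\};
\]

3. Obtain an updated estimate of $\underline{\beta}_k$ by calculating

\[
\hat{\underline{\beta}}_k^{(1)} = \hat{\underline{\beta}}_k^{(0)} + [F_k^{(0)\prime} V_k^{(0)-1} F_k^{(0)}]^{-1} F_k^{(0)\prime} V_k^{(0)-1} \underline{e}_k^{(0)},
\]

where the $n \times p_k$ matrix $F_k^{(0)}$ is the matrix of partial derivatives, with $i$-th row defined by

\[
F_{ki}^{(0)} = \frac{\partial f_k(y_{A_k i}; \underline{\beta}_k)}{\partial \underline{\beta}_k'} \tag{5.50}
\]

evaluated at $\underline{\beta}_k = \hat{\underline{\beta}}_k^{(0)}$, and the $i$-th element of the $n \times 1$ vector $\underline{e}_k^{(0)}$ is defined by

\[
e_{ki}^{(0)} = y_{ki} - f_k(y_{A_k i}; \hat{\underline{\beta}}_k^{(0)});
\]

4. Iterate steps 2 and 3, substituting $\hat{\underline{\beta}}_k^{(m-1)}$ for $\hat{\underline{\beta}}_k^{(0)}$ at the $m$-th iteration, iterating until convergence.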
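The four steps can be sketched as follows for a single equation. This is an illustration under stated assumptions rather than the procedure used in the applications: the function name `irnls`, the Poisson response (variance equal to the mean, with $a_k(\phi_k) = 1$), the mean function $\beta_0 \exp(\beta_1 x)$, the simulated data and the starting values are all hypothetical.

```python
import numpy as np


def irnls(f, jac, var_fn, y, x, beta0, tol=1e-10, max_iter=200):
    """Iteratively reweighted nonlinear least squares (Fisher scoring)."""
    beta = np.asarray(beta0, dtype=float)   # step 1: initial estimator
    for _ in range(max_iter):
        mu = f(x, beta)
        F = jac(x, beta)                     # n x p matrix of d mu_i / d beta'
        w = 1.0 / var_fn(mu)                 # step 2: diagonal of V^{-1}
        e = y - mu
        FtW = F.T * w                        # F' V^{-1} via broadcasting
        step = np.linalg.solve(FtW @ F, FtW @ e)
        beta = beta + step                   # step 3: updated estimate
        if np.max(np.abs(step)) < tol:       # step 4: iterate to convergence
            break
    return beta


# Hypothetical mean function and its derivative matrix.
def mean_fn(x, beta):
    return beta[0] * np.exp(beta[1] * x)


def mean_jac(x, beta):
    return np.column_stack([np.exp(beta[1] * x),
                            beta[0] * x * np.exp(beta[1] * x)])


rng = np.random.default_rng(4)
x = rng.uniform(0.0, 3.0, 2000)
y = rng.poisson(mean_fn(x, np.array([5.0, 0.4]))).astype(float)

beta_hat = irnls(mean_fn, mean_jac, lambda mu: mu, y, x, beta0=[4.0, 0.5])

# The converged estimate should make the weighted score F' V^{-1} e vanish.
mu_hat = mean_fn(x, beta_hat)
score = (mean_jac(x, beta_hat).T / mu_hat) @ (y - mu_hat)
print(beta_hat, score)
```

Replacing `var_fn` changes the member of the exponential family; for example, `lambda mu: mu ** 2` would correspond to a response whose variance is the square of its mean.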

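Proof. We shall use the shorthand notation $g(y_k) = g_{Y_k|Y_{A_k}}(y_k \mid \theta_k(y_{A_k}), \phi_k)$ in the proof. The proof follows that of McCullagh and Nelder [47], where the special case in which $f_k$ is a nonlinear function of a linear predictor, $\eta_k$, was considered.

For the $k$th variable in the causal system, $Y_k$, $k = 1, 2, \ldots, p$, the log likelihood is written as

\[
\sum_{i=1}^{n} \log g(y_{ki}) = \sum_{i=1}^{n} \{[y_{ki}\theta_{ki} - b_k(\theta_{ki})]/a_k(\phi_k) + c_k(y_{ki}; \phi_k)\}, \tag{5.51}
\]

which shall be denoted by $l_k$. The subscript $i$ is used to represent the $i$th subject. The $i$th individual's contribution to the $(kj)$th likelihood equation, $j = 1, 2, \ldots, p_k$, used to develop an estimator for $\beta_{kj}$, can be written as

\[
\frac{\partial l_{ki}}{\partial \beta_{kj}} = \frac{\partial l_{ki}}{\partial \theta_{ki}}\frac{\partial \theta_{ki}}{\partial \mu_{ki}}\frac{\partial \mu_{ki}}{\partial \beta_{kj}} \tag{5.52}
\]

where $\mu_{ki} = f_k(y_{A_k i}; \underline{\beta}_k)$. For the exponential family of distributions defined above by $g_{Y_k|Y_{A_k}}(y_k \mid \theta_k(y_{A_k}), \phi_k)$, the following relationships hold:

\begin{align}
b_k'(\theta_{ki}) &= \frac{\partial b_k(\theta_{ki})}{\partial \theta_{ki}} = \mu_{ki}, \tag{5.53}\\
b_k''(\theta_{ki}) &= \frac{\partial \mu_{ki}}{\partial \theta_{ki}} = \frac{V_{ki}}{a_k(\phi_k)}, \tag{5.54}
\end{align}

where

\[
V_{ki} = Var(Y_{ki})
\]

and

\begin{align}
\frac{\partial l_{ki}}{\partial \theta_{ki}} &= [y_{ki} - b_k'(\theta_{ki})]/a_k(\phi_k) \tag{5.55}\\
&= (y_{ki} - \mu_{ki})/a_k(\phi_k). \tag{5.56}
\end{align}

Using Equations 5.53-5.56 above together with Equation 5.52, the $i$th individual's contribution to the scoring equation for $\beta_{kj}$ is

\begin{align*}
\frac{\partial l_{ki}}{\partial \beta_{kj}} &= \frac{(y_{ki} - \mu_{ki})}{a_k(\phi_k)}\frac{a_k(\phi_k)}{V_{ki}}\frac{\partial \mu_{ki}}{\partial \beta_{kj}}\\
&= \frac{(y_{ki} - \mu_{ki})}{V_{ki}}\frac{\partial \mu_{ki}}{\partial \beta_{kj}}.
\end{align*}

Accordingly, the scoring equation for $\beta_{kj}$ is given by

\[
\sum_{i=1}^{n} \frac{(y_{ki} - \mu_{ki})}{V_{ki}}\frac{\partial \mu_{ki}}{\partial \beta_{kj}} = 0. \tag{5.57}
\]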

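These likelihood equations are nonlinear functions of $\underline{\beta}_k$ and, thus, solving them for $\underline{\beta}_k$ requires iterative methods (Agresti [1]). The Fisher scoring method (Agresti [1], McCullagh and Nelder [47]) of solving the equations defined by Equation 5.57 involves the information matrix, $I(\underline{\beta}_k)$, with $jj'$th element given by

\begin{align}
I_{jj'} &= -E\left[\frac{\partial^2 l_k}{\partial \beta_{kj}\,\partial \beta_{kj'}}\right] \tag{5.58}\\
&= -E\left[\frac{\partial}{\partial \beta_{kj'}} \sum_{i=1}^{n} \frac{(y_{ki} - \mu_{ki})}{V_{ki}}\frac{\partial \mu_{ki}}{\partial \beta_{kj}}\right] \tag{5.59}\\
&= -\sum_{i=1}^{n} E\left[(y_{ki} - \mu_{ki})\frac{\partial}{\partial \beta_{kj'}}\left(\frac{1}{V_{ki}}\frac{\partial \mu_{ki}}{\partial \beta_{kj}}\right)\right] \tag{5.60}\\
&\quad + \sum_{i=1}^{n} E\left[\frac{1}{V_{ki}}\frac{\partial \mu_{ki}}{\partial \beta_{kj}}\frac{\partial \mu_{ki}}{\partial \beta_{kj'}}\right]. \notag
\end{align}

The first term of Equation 5.60 is zero because $E(y_{ki} - \mu_{ki}) = 0$ and, hence,

\begin{align}
I_{jj'} &= \sum_{i=1}^{n} \frac{1}{V_{ki}}\frac{\partial \mu_{ki}}{\partial \beta_{kj}}\frac{\partial \mu_{ki}}{\partial \beta_{kj'}} \tag{5.61}\\
&= (F_k' V_k^{-1} F_k)_{jj'} \tag{5.62}
\end{align}

where $V_k^{-1} = \mathrm{Diag}(V_{ki}^{-1})$, $i = 1, 2, \ldots, n$, and the $F_k$ matrix is of dimension $(n \times p_k)$ and is defined by

\[
F_k = \frac{\partial \underline{\mu}_k}{\partial \underline{\beta}_k'}
\]

where $\underline{\mu}_k = (\mu_{k1}, \mu_{k2}, \ldots, \mu_{kn})'$.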

Via the Fisher scoring method, the $m$th approximation for the ML estimator, $\hat{\underline{\beta}}_k^{(m)}$, is updated by

\[
\hat{\underline{\beta}}_k^{(m+1)} = \hat{\underline{\beta}}_k^{(m)} + \Delta\hat{\underline{\beta}}_k^{(m)} \tag{5.63}
\]

where

\[
\Delta\hat{\underline{\beta}}_k^{(m)} = (F_k^{(m)\prime} V_k^{(m)-1} F_k^{(m)})^{-1} F_k^{(m)\prime} V_k^{(m)-1} \underline{e}_k^{(m)}
\]

and $F_k^{(m)\prime} V_k^{(m)-1} F_k^{(m)}$ is the $m$th approximation to the Information Matrix with $jj'$th element given in Equation 5.62, obtained by evaluating these elements at $\hat{\underline{\beta}}_k^{(m)}$.

The iteratively reweighted nonlinear least squares estimator at the $(m + 1)$th iteration is calculated as the solution to the following normal equation (SAS/STAT User's Guide [58]):

\[
F_k^{(m)\prime} V_k^{(m)-1} F_k^{(m)} \hat{\underline{\beta}}_k^{(m+1)} = F_k^{(m)\prime} V_k^{(m)-1} F_k^{(m)} \hat{\underline{\beta}}_k^{(m)} + F_k^{(m)\prime} V_k^{(m)-1} \underline{e}_k^{(m)}. \tag{5.64}
\]

Note that the solution, $\hat{\underline{\beta}}_k^{(m+1)}$, is given in Equation 5.63.

Hence, the LIML estimators that result from solutions to the likelihood equations given in Equation 5.57 are solutions to the generalized nonlinear "normal" equations and are those obtained through iteratively reweighted nonlinear least squares for the $k$th equation. □

Now, consider the fact that we have a sequence of structural equations, one
each for $Y_1, Y_2, \ldots, Y_p$. Assuming that each endogenous variable has an exponential
conditional density function, given antecedent endogenous variables, we can write
the joint density for this sequence of $p$ variables as

\[ g(y_1, y_2, y_3, \ldots, y_p; \beta_1, \beta_2, \ldots, \beta_p) = g_{Y_1}(y_1)\, g_{Y_2 \mid Y_1}(y_2 \mid y_1)\, g_{Y_3 \mid Y_{A_3}}(y_3 \mid y_{A_3}) \cdots g_{Y_p \mid Y_{A_p}}(y_p \mid y_{A_p}) \]

where

\[ g_{Y_r \mid Y_{A_r}}(y_r \mid y_{A_r}; \beta_r) = \exp\{[y_r \theta_r - b_r(\theta_r)]/a_r(\phi_r) + c_r(y_r; \phi_r)\} \]

represents the conditional density function of $Y_r$, $r = 1, 2, \ldots, p$, given $Y_{A_r} = y_{A_r}$,
and $\theta_r$ is a function of $y_{A_r}$. Hence, the contribution to the likelihood function for
the $i$th subject is

\[ L_i = g(y_{1i}, y_{2i}, \ldots, y_{pi}) = \Big[\prod_{r=2}^{p} g_{Y_r \mid Y_{A_r}}(y_{ri} \mid y_{A_r i})\Big]\, g_{Y_1}(y_{1i}) \tag{5.65} \]

where

\[ g_{Y_r \mid Y_{A_r}}(y_{ri} \mid y_{A_r i}) = \exp\{[y_{ri} \theta_{ri} - b_r(\theta_{ri})]/a_r(\phi_r) + c_r(y_{ri}; \phi_r)\} \]

and

\[ \theta_{ri} = \theta_r(y_{A_r i}). \]

We can then express the full information likelihood function for all $p$ variables
as

\[ \prod_{i=1}^{n} L_i = \prod_{i=1}^{n} \Big\{\Big[\prod_{r=2}^{p} g_{Y_r \mid Y_{A_r}}(y_{ri} \mid y_{A_r i})\Big]\, g_{Y_1}(y_{1i})\Big\} \tag{5.66} \]

and the log of the full information likelihood as

\[ \sum_{i=1}^{n} l_i = \sum_{i=1}^{n} \Big\{\sum_{r=2}^{p} \log g_{Y_r \mid Y_{A_r}}(y_{ri} \mid y_{A_r i}) + \log g_{Y_1}(y_{1i})\Big\}, \tag{5.67} \]

where $l_i = \log L_i$. We shall denote $\sum_{i=1}^{n} l_i$ as $l(\beta)$. Note that the FIML estimator
of $\beta$ is calculated as the simultaneous solution to the scoring equations

\[ \frac{\partial l(\beta)}{\partial \beta_{kj}} = 0 \tag{5.68} \]

for $k = 1, 2, \ldots, p$ and $j = 0, 1, \ldots, p_k$. The following theorem shows that
the FIML estimators of parameters in models with conditionally exponentially
distributed endogenous variables are the LIML estimators and are asymptotically
independent and normally distributed.

Theorem 5.6.2. Let $Y_1$ be an exponentially distributed random variable and let
$Y_k$ have conditional exponential distribution given $Y_{A_k}$, for each $k = 2, 3, \ldots, p$.
Further assume that the $Y_k$ are related through the structural equations given in
Equations 5.24-5.26, satisfying assumptions 1, 3 and 4 of that model. Then the
following three statements hold true:

1. The FIML estimator of $\beta = (\beta_1', \beta_2', \ldots, \beta_p')'$ is $\hat\beta = (\hat\beta_1', \hat\beta_2', \ldots, \hat\beta_p')'$,
where each $\hat\beta_k$, $k = 1, 2, \ldots, p$, is the single equation LIML estimator defined in
Theorem 5.6.1;

2. $\hat\beta_k$ is asymptotically independent of $\hat\beta_{k'}$ for all $k \neq k'$, $k = 1, 2, \ldots, p$,
$k' = 1, 2, \ldots, p$;

3. For large $n$, $\hat\beta_k$ is approximately normally distributed with mean $\beta_k$ and
variance matrix $(F_k' V_k^{-1} F_k)^{-1}$ defined in Equation 5.62.

Proof. To obtain the FIML estimators, we must solve the $\sum_{k=1}^{p} p_k$ scoring
equations defined in Equation 5.68 for $\beta$. Note that each $\beta_{kj}$ parameter, $j =
0, 1, \ldots, p_k$, will only occur in the structural equation for the $k$th endogenous
variable. Therefore, the $\beta_{kj}$ coefficient will only occur in the function $g_{Y_k \mid Y_{A_k}}(y_k \mid y_{A_k})$ in
Equation 5.67. Hence, the derivative of the right-hand side of Equation 5.67 with
respect to $\beta_{kj}$ is written as

\[ \sum_{i=1}^{n} \frac{\partial l_i}{\partial \beta_{kj}} = \sum_{i=1}^{n} \frac{\partial}{\partial \beta_{kj}} \big[\log g_{Y_k \mid Y_{A_k}}(y_{ki} \mid y_{A_k i})\big] \]

because it is true, for $k \neq k'$, that

\[ \frac{\partial}{\partial \beta_{kj}} \big[\log g_{Y_{k'} \mid Y_{A_{k'}}}(y_{k'i} \mid y_{A_{k'} i})\big] = 0. \]

This is the same likelihood, as in Equation 5.51, that results from LIML estimation
and, consequently, differentiating the full likelihood with respect to parameters
from the equation for the $k$th endogenous variable amounts to simply differentiating
the likelihood equations associated only with the $k$th model. Therefore, in
strictly ordered path models, the ML estimator obtained from FIML is equivalent
to the ML estimator obtained from LIML.

We now show that the LIML estimators of the parameters in any one structural
equation are asymptotically independent of the LIML estimators of those in
any other equation.

An arbitrary element in the FIML information matrix, $I(\beta)$, is defined by

\[ -E\left[\frac{\partial^2 l(\beta)}{\partial \beta_{kj}\, \partial \beta_{k'j'}}\right] \]

for some choice of $k$, $j$, $k'$ and $j'$.
For $k \neq k'$, note that

\[ E\left(\frac{\partial^2 l(\beta)}{\partial \beta_{kj}\, \partial \beta_{k'j'}}\right) = 0 \]

because $\partial l(\beta)/\partial \beta_{kj}$ does not depend on parameters of the $k'$th equation when $k \neq k'$.
Thus, we can express $I(\beta)$ as

\[ I(\beta) = \begin{pmatrix}
I_1(\beta_1) & 0 & 0 & \cdots & 0 \\
0 & I_2(\beta_2) & 0 & \cdots & 0 \\
0 & 0 & I_3(\beta_3) & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & I_p(\beta_p)
\end{pmatrix} \tag{5.69} \]

where $I_k(\beta_k)$, $k = 1, 2, \ldots, p$, is of dimension $p_k \times p_k$ with $jj'$th element

\[ -E\left[\frac{\partial^2 l(\beta)}{\partial \beta_{kj}\, \partial \beta_{kj'}}\right]. \]

Thus, the FIML $I(\beta)$ is a block diagonal matrix, where the $k$th block is $I_k(\beta_k)$,
the LIML information matrix for the $k$th equation parameters. Since the FIML
estimator $\hat\beta$ is asymptotically normal with covariance matrix $I^{-1}(\beta)$, it follows that
the estimators of $\beta_k$ and $\beta_{k'}$ are asymptotically independent for all $k \neq k'$.

Under mild regularity conditions, the ML estimator $\hat\beta$ is approximately
normally distributed with mean $\beta$ and variance matrix given by $I^{-1}(\beta)$ for large
$n$ (Mood et al. [50]). It follows from Equation 5.69 that each $\hat\beta_k$ is approximately
normally distributed with variance matrix $I_k^{-1}(\beta_k)$.

In practice, the above result means that the $\beta_{kj}$ parameters may be estimated
via LIML estimation of $\beta_k$ and that these LIML estimators can be treated as
independent for large sample sizes.

In the next section we present an asymptotic test of the hypothesis that an
expected direct or indirect effect is nonexistent when the endogenous variables are
conditionally exponentially distributed and the $f_k$ each are functions of a linear
predictor, $\eta_k$.

5.6.3    Hypothesis  Testing  for  Generalized  Linear  Path  Models 

In this section we consider the special case of the model in Section 5.6.1 where
the $f_k$ functions are functions of linear predictors, $\eta_k = \beta_{k0} + \beta_{k1} Y_1 + \cdots + \beta_{k,k-1} Y_{k-1}$.
We call this model the Generalized Linear Path Model because of its commonality
with the Generalized Linear Model of McCullagh and Nelder [47]. Note, however,
that we restrict attention to continuous $Y$ variables in this chapter. The following
theorem gives an asymptotic test of the hypothesis that an expected direct or
indirect effect is zero in this model.

The generalized linear path model is defined by

\begin{align}
Y_1 &= \eta_1 + \epsilon_1 \tag{5.70} \\
Y_2 &= f_2(\eta_2) + \epsilon_2 \tag{5.71} \\
&\;\;\vdots \notag \\
Y_p &= f_p(\eta_p) + \epsilon_p, \tag{5.72}
\end{align}

where $Y_k$, $k = 1, 2, \ldots, p$, have distributions in the regular exponential family
(McCullagh and Nelder [47]), $\eta_k$ is a linear combination of antecedent endogenous
variables called the linear predictor and defined by $\eta_k = Y_{A_k}' \beta_k$. Note again, as
presented in the proof of Theorem 5.5.2, $E_{Y_I \mid Y_{A_k}, Y_k}[\partial f_l / \partial \eta_l] \neq 0$ because the partial
derivative of the $f_l$ function with respect to $\eta_l$ is either a strictly positive or strictly
negative function and has expectation that is nonzero.
Theorem 5.6.3. For path models defined by Equations 5.70-5.72 with endogenous
variables following conditional exponential distributions and $f_k$ that
are functions of the linear predictor $\eta_k = Y_{A_k}' \beta_k$, testing the null hypothesis
$H_0 : E_{Y_I \mid Y_{A_k}, Y_k}(IE_{l(Q)k}) = 0$, for a specified $Q \in 2^{B_{lk}}$, is equivalent to testing the
null hypothesis

\[ H_0 : \prod_{(k',l') \in l(Q)k} \beta_{l'k'} = 0, \]

where $B_{lk} = \{k + 1, k + 2, \ldots, l - 1\}$, $2^{B_{lk}}$ denotes the power set of $B_{lk}$, $Q$ is any
arbitrary element of $2^{B_{lk}}$ and the pair $(k', l')$ denotes a pair of adjacent indices in
the set $\{l\} \cup Q \cup \{k\}$, expressed as $l(Q)k$. Furthermore, an asymptotic test of this
hypothesis can be based on the statistic

\[ Z = \frac{\prod_{(k',l') \in l(Q)k} \hat\beta_{l'k'}}{\sqrt{\widehat{\mathrm{Var}}\big(\prod_{(k',l') \in l(Q)k} \hat\beta_{l'k'}\big)}} \]

where

\[ \mathrm{Var}\Big(\prod_{(k',l') \in l(Q)k} \hat\beta_{l'k'}\Big) = \prod_{(k',l') \in l(Q)k} \big[\mathrm{Var}(\hat\beta_{l'k'}) + \beta_{l'k'}^2\big] - \prod_{(k',l') \in l(Q)k} \beta_{l'k'}^2 \]

and $Z \sim N(0, 1)$ given $H_0$ and large $n$ and where $\mathrm{Var}(\hat\beta_{l'k'})$ represents the $k'$th
diagonal element of the variance-covariance matrix of $\hat\beta_{l'}$.

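Proof. To estimate the expected direct and indirect effects of the model defined by
Equations 5.70-5.72 and the model assumptions above in Section 5.6.1, we must
estimate quantities that are of the general form

\[ E_{\epsilon_I}[IE_{l(Q)k}] = \Big(\prod_{(k',l') \in l(Q)k} \beta_{l'k'}\Big)\, E_{\epsilon_I}\Big[\prod_{(k',l') \in l(Q)k} \frac{\partial f_{l'}(\eta_{l'})}{\partial \eta_{l'}}\Big]. \]

Hence, testing the null hypothesis $H_0 : E_{Y_I \mid Y_{A_k}, Y_k}(IE_{l(Q)k}) = 0$ is equivalent to testing
the null hypothesis $H_0 : \prod_{(k',l') \in l(Q)k} \beta_{l'k'} = 0$ because $E_{\epsilon_I}\big[\prod_{(k',l') \in l(Q)k} \partial f_{l'} / \partial \eta_{l'}\big] \neq 0$.

From Theorem 5.6.2 above, the estimates of the $\beta_{l'k'}$ parameters obtained
from single equation LIML estimation are asymptotically independent and are
asymptotically normally distributed. Thus, following the proof of Theorem 5.5.2
directly, for large $n$, $\prod_{(k',l') \in l(Q)k} \hat\beta_{l'k'}$ has an approximate normal distribution
with mean $\prod_{(k',l') \in l(Q)k} \beta_{l'k'}$ and variance $\mathrm{Var}\big(\prod_{(k',l') \in l(Q)k} \hat\beta_{l'k'}\big)$, estimated by
$\widehat{\mathrm{Var}}\big(\prod_{(k',l') \in l(Q)k} \hat\beta_{l'k'}\big)$ as derived in Equations 5.45-5.49. That is,

\[ \widehat{\mathrm{Var}}\Big(\prod_{(k',l') \in l(Q)k} \hat\beta_{l'k'}\Big) = \prod_{(k',l') \in l(Q)k} \big[\widehat{\mathrm{Var}}(\hat\beta_{l'k'}) + \hat\beta_{l'k'}^2\big] - \prod_{(k',l') \in l(Q)k} \hat\beta_{l'k'}^2 \]

where $\widehat{\mathrm{Var}}(\hat\beta_{l'k'})$ is the $k'$th diagonal element of the variance-covariance matrix of
$\hat\beta_{l'}$, and can be obtained from the single equation estimation of $\beta_{l'}$.
Hence, the quantity

\[ Z = \frac{\prod_{(k',l') \in l(Q)k} \hat\beta_{l'k'}}{\sqrt{\widehat{\mathrm{Var}}\big(\prod_{(k',l') \in l(Q)k} \hat\beta_{l'k'}\big)}} \]

has an approximate standard normal distribution under $H_0$ and for large $n$. □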
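
The test statistic of Theorem 5.6.3 is simple to compute once the single equation estimates and their variances are available. The following Python sketch is an illustration only: the function name `product_z_statistic` and the numerical inputs are hypothetical, and the variance formula is the product form stated in the theorem with estimates substituted for parameters.

```python
import math

def product_z_statistic(beta_hats, variances):
    """Z statistic for H0: product of path coefficients along l(Q)k equals zero.

    beta_hats: estimated coefficients, one per adjacent pair (k', l') on the path.
    variances: estimated variances of those coefficients (treated as independent).
    """
    prod_b = math.prod(beta_hats)
    # Var(prod b) = prod(Var(b) + b^2) - prod(b^2), with estimates plugged in.
    var_prod = (math.prod(v + b * b for b, v in zip(beta_hats, variances))
                - math.prod(b * b for b in beta_hats))
    z = prod_b / math.sqrt(var_prod)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal p-value
    return z, var_prod, p_value

# Hypothetical two-link path Y1 -> Y2 -> Y3 with estimates 0.5 and 0.4.
z, var_prod, p_value = product_z_statistic([0.5, 0.4], [0.01, 0.04])
print(z, var_prod, p_value)
```

The sketch relies on the asymptotic independence established in Theorem 5.6.2, which is why no covariance terms between coefficients from different equations appear.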


CHAPTER  6 
PATH  MODELS  WITH  DICHOTOMOUS  VARIABLES 

When the endogenous variables are discrete, then the $f_k(y_{A_k}; \beta_k)$ mean
functions  are  discrete  valued.  Hence,  derivatives  of  fk  are  not  defined  and  cannot 
be  used,  as  in  the  continuous  variable  case,  to  define  total,  direct  and  indirect 
effects.  Thus,  a  COE  partitioning  of  total  effects  cannot  be  derived  as  in  Section 
5.5,  even  if  it  exists.  We  shall  show  in  this  chapter  that  a  COE  partitioning  does 
exist  for  models  with  dichotomous  endogenous  variables  and  derive  this  COE  using 
the  calculus  of  finite  differences  presented  in  Chapter  3. 

We  find  it  instructive  and  helpful  to  first  discuss  special  case  models  of  the 
general  model  containing  dichotomous  variables.  These  special  cases  will  be 
generalized  in  Section  6.4,  where  a  COE  partitioning  of  total  effects  in  a  causally 
ordered  chain  of  dichotomous  variables  is  derived. 

6.1     Definition  of  Effects 


We now present a formal definition of a TE that is applicable to path models
involving dichotomous variables. This definition is analogous to the definition
presented in Definition 5.4.1 in Section 5.4 for models with continuous variables,
but utilizes difference quotients in place of derivatives. As in Chapter 5, we shall
study the conditional TE of variable $Y_k$ on variable $Y_l$, given $Y_{A_k}$, for any $k$ and
$l$ such that $1 \le k < l \le p$, where $Y_p$ is the last variable in the system. Total,
direct and indirect effects are each defined and denoted as below. Also, we utilize
notation given earlier in Section 5.4. That is, we let

$Y_A$ collectively represent all variables antecedent to $Y_k$ in the causal chain,
$Y_I$ collectively represent all intermediate variables between $Y_k$ and $Y_l$, and
$Y_S$ collectively represent all variables subsequent to $Y_l$ in the causal chain.

Definition 6.1.1. The conditional total effect of $Y_k$ on $Y_l$ given $Y_{A_k} = y_{A_k}$ is
denoted $TE_{lk \mid A_k}$ and is defined by

\[ TE_{lk \mid A_k} = \Delta_{y_k} E_{Y_l \mid Y_{A_k}, Y_k}(Y_l \mid y_{A_k}, y_k). \]

Note that, in the definition of conditional total effect given immediately above,
we adopt the convention that when $k = 1$, we write

\[ TE_{l1} = \Delta_{y_1} E_{Y_l \mid Y_1}(Y_l \mid y_1). \]

Definition 6.1.2. The conditional direct effect of $Y_k$ on $Y_l$ given $Y_{A_k} = y_{A_k}$ is
denoted $DE_{lk}$ and is defined by

\[ DE_{lk} = \Delta_{y_k} f_l(y_{A_k}, y_k, Y_I). \]

Definition 6.1.3. The conditional indirect effect of $Y_k$ on $Y_l$ through an
arbitrarily selected set of intermediate variable(s), with associated set of indices $Q$,
is denoted $IE_{l(Q)k}$ and defined by

\[ IE_{l(Q)k} = \prod_{(k',l') \in l(Q)k} \Delta_{y_{k'}} f_{l'}(y^*_{A_{l'}}) \]

where the $j$th element of $y^*_{A_{l'}}$, $j < l'$, is defined as

\[ y_j^* = \begin{cases} y_j + h_j & \text{if } j < k' \text{ and } j \in l(Q)k - l \\ y_j & \text{else,} \end{cases} \]

and $h_j$ is such that $y_j + h_j \in \{0, 1\}$.

Note again here that, for convention, we will henceforth refer to the conditional
direct and indirect effects as, simply, direct and indirect effects.

6.2     Model  with  Three  Variables

Consider a chain of three variables in which $Y_1$, $Y_2$ and $Y_3$ are all Bernoulli
random variables with arbitrary conditional mean functions $f_2$ and $f_3$. The relevant
set of structural equations for this system, presented in less detailed notation than
previously, is as follows:

\begin{align}
Y_1 &= \pi_1 + \epsilon_1 \tag{6.1} \\
Y_2 &= f_2(Y_1) + \epsilon_2 \tag{6.2} \\
Y_3 &= f_3(Y_1, Y_2) + \epsilon_3. \tag{6.3}
\end{align}

The assumptions for this model are:

1. $Y_1$ is distributed as a Bernoulli random variable with mean $\pi_1$.

2. $Y_k$, given $Y_{A_k}$, is distributed as a Bernoulli random variable with mean
function $f_k(y_{A_k})$, $k = 2, 3$.

3. $E(\epsilon_1) = E(\epsilon_2 \mid y_1) = E(\epsilon_3 \mid y_1, y_2) = 0$.

Note that $\pi_1 = P(Y_1 = 1)$ and that the notation suggesting that $f_2$ and $f_3$ are
functions of parameters, say $\beta_2$ and $\beta_3$, has been suppressed. The model errors are
dichotomous variables with distributions defined by the following:

\[ \epsilon_1 = \begin{cases} 1 - \pi_1 & \text{with probability } \pi_1 \\ -\pi_1 & \text{with probability } 1 - \pi_1 \end{cases} \]

and, given $Y_1 = y_1$,

\[ \epsilon_2 = \begin{cases} 1 - f_2(y_1) & \text{with probability } f_2(y_1) \\ -f_2(y_1) & \text{with probability } 1 - f_2(y_1) \end{cases} \]

and, given $Y_1 = y_1$ and $Y_2 = y_2$,

\[ \epsilon_3 = \begin{cases} 1 - f_3(y_1, y_2) & \text{with probability } f_3(y_1, y_2) \\ -f_3(y_1, y_2) & \text{with probability } 1 - f_3(y_1, y_2). \end{cases} \]
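
As a brief check of assumption 3, the following worked computation (added for illustration, using only the two-point error distribution stated above) confirms that $\epsilon_2$ has zero conditional mean, while its conditional variance changes with $y_1$ through $f_2(y_1)$:

\[ E(\epsilon_2 \mid y_1) = [1 - f_2(y_1)]\, f_2(y_1) + [-f_2(y_1)]\,[1 - f_2(y_1)] = 0, \]

\[ \mathrm{Var}(\epsilon_2 \mid y_1) = [1 - f_2(y_1)]^2 f_2(y_1) + f_2(y_1)^2 [1 - f_2(y_1)] = f_2(y_1)\,[1 - f_2(y_1)]. \]

Since the distribution of $\epsilon_2$ is tied to the value of $Y_1$, and hence to $\epsilon_1$, the errors in this system are not mutually independent in general.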

6.2.1     The  COE  in  the  Three  Variable  Model

In this section, we utilize the assumption of zero conditional expectation for
the error terms in the above three variable model and the assumption that $Y_1$, $Y_2$
and $Y_3$ have a strictly ordered causal relationship. For the system listed above in
Equations 6.1-6.3, we see that

\begin{align}
E(Y_3 \mid Y_1 = y_1) &= E_{Y_2 \mid Y_1}[E_{Y_3 \mid Y_1, Y_2}(Y_3 \mid Y_1 = y_1, Y_2)] \tag{6.4} \\
&= E_{Y_2 \mid Y_1}[f_3(y_1, Y_2)] \tag{6.5} \\
&= E_{\epsilon_2 \mid Y_1}[f_3(y_1, f_2(y_1) + \epsilon_2)] \tag{6.6} \\
&= f_3(y_1, 1) \cdot f_2(y_1) + f_3(y_1, 0) \cdot (1 - f_2(y_1)). \tag{6.7}
\end{align}

By Definition 6.1.1 and the convention for $k = 1$, the TE of $Y_1$ on $Y_3$ is

\begin{align}
TE_{31} &= \Delta_{y_1} E_{Y_3 \mid Y_1}(Y_3 \mid Y_1 = y_1) \tag{6.8} \\
&= [E_{Y_3 \mid Y_1}(Y_3 \mid Y_1 = y_1 + h) - E_{Y_3 \mid Y_1}(Y_3 \mid Y_1 = y_1)]/h \tag{6.9} \\
&= E_{Y_3 \mid Y_1}(Y_3 \mid Y_1 = 1) - E_{Y_3 \mid Y_1}(Y_3 \mid Y_1 = 0) \tag{6.10}
\end{align}

because $h$ must be such that $(y_1 + h) \in D_{y_1}$, that is, $(y_1 + h) \in \{0, 1\}$. Hence, note
that $h$ must be $-1$ when $Y_1 = 1$ and $h$ must be $+1$ when $Y_1 = 0$. Also, note that,
when $Y_1 = 1$,

\[ \Delta_{y_1} f(1) = \frac{f(1 + h) - f(1)}{h} = \frac{f(0) - f(1)}{-1} = \frac{f(1) - f(0)}{1}. \]

Likewise, when $Y_1 = 0$,

\[ \Delta_{y_1} f(0) = \frac{f(0 + h) - f(0)}{h} = \frac{f(1) - f(0)}{1}. \]

Thus, $\Delta_{y_1} f(y_1)$ does not depend on the particular value of $Y_1 = y_1$.
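
The difference quotient for a dichotomous argument can be written as a small helper. The Python sketch below is illustrative only; the name `delta` and the example mean function are hypothetical. It chooses the step $h$ so that $y + h$ stays in $\{0, 1\}$ and shows that the quotient is the same at both points.

```python
def delta(f, y):
    """Difference quotient of f at y in {0, 1}; h moves y to the other admissible value."""
    h = 1 - 2 * y                 # h = +1 when y = 0, h = -1 when y = 1
    return (f(y + h) - f(y)) / h

def f(y):
    # Hypothetical mean function on {0, 1}.
    return 0.2 + 0.5 * y

print(delta(f, 0), delta(f, 1))   # both equal f(1) - f(0)
```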
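By applying Equation 3.6 to Equation 6.7 above, we see that

\begin{align}
\Delta_{y_1} E(Y_3 \mid Y_1 = y_1) &= \Delta_{y_1} f_3(y_1, 1)\, f_2(y_1) + f_3(y_1 + h, 1)\, \Delta_{y_1} f_2(y_1) \tag{6.11} \\
&\quad + \Delta_{y_1} f_3(y_1, 0)\,[1 - f_2(y_1)] + f_3(y_1 + h, 0)\, \Delta_{y_1} [1 - f_2(y_1)] \notag \\
&= [f_3(1, 1) - f_3(0, 1)]\, f_2(y_1) + f_3(y_1 + h, 1)\,[f_2(1) - f_2(0)] \tag{6.12} \\
&\quad + [f_3(1, 0) - f_3(0, 0)]\,[1 - f_2(y_1)] - f_3(y_1 + h, 0)\,[f_2(1) - f_2(0)] \notag \\
&= E_{Y_2 \mid Y_1}[\Delta_{y_1} f_3(y_1, Y_2) \mid Y_1 = y_1] \tag{6.13} \\
&\quad + [f_3(y_1 + h, 1) - f_3(y_1 + h, 0)]\,[f_2(1) - f_2(0)] \notag \\
&= E_{\epsilon_2 \mid Y_1}[f_3(1, f_2(y_1) + \epsilon_2) - f_3(0, f_2(y_1) + \epsilon_2)] \tag{6.14} \\
&\quad + [f_3(y_1 + h, 1) - f_3(y_1 + h, 0)]\,[f_2(1) - f_2(0)] \notag \\
&= E_{\epsilon_2 \mid Y_1}[\Delta_{y_1} f_3(y_1, f_2(y_1) + \epsilon_2)] \tag{6.15} \\
&\quad + \Delta_{y_2} f_3(y_1 + h, y_2) \cdot \Delta_{y_1} f_2(y_1) \notag \\
&= \{[\Delta_{y_1} f_3(y_1, 1)]\, f_2(y_1) + [\Delta_{y_1} f_3(y_1, 0)]\,[1 - f_2(y_1)]\} \tag{6.16} \\
&\quad + \Delta_{y_2} f_3(y_1 + h, y_2) \cdot \Delta_{y_1} f_2(y_1) \notag \\
&= DE_{31}(y_1) + DE_{32}(y_1 + h)\, DE_{21}, \tag{6.17}
\end{align}

that is, $DE_{31}(y_1) + DE_{32}(y_1 + h)\, DE_{21}$ represents the sum of an average direct
effect, averaged over the conditional distribution of $Y_2$ given $Y_1 = y_1$, and the
indirect effect of $Y_1$ on $Y_3$, evaluated at $Y_1 = y_1 + h$.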
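
The decomposition in Equation 6.17 can be verified numerically for any choice of mean functions. The Python sketch below is an illustration under stated assumptions: the logistic forms and coefficients chosen for `f2` and `f3` are hypothetical and serve only to supply values in $(0, 1)$. For each $y_1$ it computes the direct and indirect pieces and compares their sum with the total effect from Equation 6.10.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical conditional mean functions (illustrative coefficients only).
def f2(y1):
    return expit(-0.3 + 0.8 * y1)

def f3(y1, y2):
    return expit(-1.0 + 0.6 * y1 + 1.2 * y2 + 0.5 * y1 * y2)

def expected_y3(y1):
    # Equation 6.7: E(Y3 | Y1 = y1)
    return f3(y1, 1) * f2(y1) + f3(y1, 0) * (1.0 - f2(y1))

def coe_terms(y1):
    shifted = 1 - y1                       # y1 + h, the other admissible value
    # DE31(y1): direct effect averaged over Y2 given Y1 = y1 (Equation 6.16)
    de31 = ((f3(1, 1) - f3(0, 1)) * f2(y1)
            + (f3(1, 0) - f3(0, 0)) * (1.0 - f2(y1)))
    de32 = f3(shifted, 1) - f3(shifted, 0)  # DE32 evaluated at y1 + h
    de21 = f2(1) - f2(0)                    # DE21
    return de31, de32 * de21

te31 = expected_y3(1) - expected_y3(0)      # Equation 6.10
for y1 in (0, 1):
    direct, indirect = coe_terms(y1)
    print(y1, direct, indirect, direct + indirect, te31)
```

The individual pieces differ between $y_1 = 0$ and $y_1 = 1$, but their sum does not, which is the point made in the text.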

Note that, since $\Delta_{y_1} E_{Y_3 \mid Y_1}(Y_3 \mid Y_1 = y_1)$, as given in Equation 6.10, does not
depend on $Y_1$, we have that Equation 6.17 is the same when $Y_1 = 1$ as when $Y_1 = 0$.
Both terms in Equation 6.17 above, however, are functions of $y_1$. So, we can write
the TE as the expectation of $DE(Y_1) + IE(Y_1 + h)$ over $Y_1$. Thus, taking the
expectation with respect to $Y_1$, we have

\begin{align}
TE_{31} &= E_{Y_1}\{[\Delta_{y_1} f_3(Y_1, 1)]\, f_2(Y_1) + [\Delta_{y_1} f_3(Y_1, 0)]\,[1 - f_2(Y_1)]\} \notag \\
&\quad + E_{Y_1}[\Delta_{y_2} f_3(Y_1 + h, y_2)]\, \Delta_{y_1} f_2(y_1) \notag \\
&= \{[f_3(1, 1) - f_3(0, 1)]\, f_2(0) + [f_3(1, 0) - f_3(0, 0)]\,[1 - f_2(0)]\}(1 - \pi_1) \tag{6.18} \\
&\quad + \{[f_3(1, 1) - f_3(0, 1)]\, f_2(1) + [f_3(1, 0) - f_3(0, 0)]\,[1 - f_2(1)]\}\, \pi_1 \notag \\
&\quad + \{[f_3(0, 1) - f_3(0, 0)]\, \pi_1 + [f_3(1, 1) - f_3(1, 0)](1 - \pi_1)\}\,[f_2(1) - f_2(0)]. \notag
\end{align}

Note that the quantity

\begin{align}
&\{[f_3(1, 1) - f_3(0, 1)]\, f_2(0) + [f_3(1, 0) - f_3(0, 0)]\,[1 - f_2(0)]\}(1 - \pi_1) \notag \\
&\quad + \{[f_3(1, 1) - f_3(0, 1)]\, f_2(1) + [f_3(1, 0) - f_3(0, 0)]\,[1 - f_2(1)]\}\, \pi_1 \notag
\end{align}

in the first part of Equation 6.18 represents the partial effect of $Y_1$ on $Y_3$ averaged
over $Y_2$ given $Y_1$ and then over $Y_1$. Likewise, the value $[f_3(0, 1) - f_3(0, 0)]\, \pi_1 +
[f_3(1, 1) - f_3(1, 0)](1 - \pi_1)$ denotes the partial effect of $Y_2$ on $Y_3$ averaged over $Y_1$,
where $Y_1 = 1$ (probability $\pi_1$) is shifted to $Y_1 + h = 0$ and $Y_1 = 0$ is shifted to $Y_1 + h = 1$.
The final quantity shown in Equation 6.18 above, $[f_2(1) - f_2(0)]$, represents the
effect of $Y_1$ on $Y_2$. Thus, the entire quantity in Equation 6.18 can be written as
$TE_{31} = DE + IE$ and interpreted as the "average of the direct effects of $Y_1$ on $Y_3$,
averaged over $Y_2$ and $Y_1$" plus the quantity "average of the direct effects of $Y_2$ on $Y_3$
evaluated at $Y_1 + h$, averaged over $Y_1$, multiplied by the direct effect of $Y_1$ on $Y_2$".

Therefore, we can now write a COE, as applied to this system of three
dichotomous variables. The total effect of variable $Y_1$ on variable $Y_3$ is

\[ TE_{31} = E_{Y_1}\{E_{Y_2 \mid Y_1}[\Delta_{y_1} f_3(y_1, Y_2)]\} + \{E_{Y_1}[\Delta_{y_2} f_3(Y_1 + h, y_2)]\}\, \Delta_{y_1} f_2(y_1). \]

This is analogous to the COE in Theorem 5.5.1 which derives from the MVCR in
the analogous nonlinear path model with continuous variables, $Y_1$, $Y_2$ and $Y_3$, and
to the COC form, $\beta_{31} + \beta_{32}\beta_{21}$, in linear path models. Hence, the quantity

\begin{align}
TE_{31} &= E_{Y_1}\{E_{Y_2 \mid Y_1}[\Delta_{y_1} f_3(y_1, Y_2)]\} + \{E_{Y_1}[\Delta_{y_2} f_3(Y_1 + h, y_2)]\}\, \Delta_{y_1} f_2(y_1) \notag \\
&= E_{Y_1}[DE_{31}(Y_1)] + E_{Y_1}[DE_{32}(Y_1 + h)\, DE_{21}] \notag
\end{align}

represents a generalization of the COC to path models with three dichotomous
variables.
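
The averaged form of the three variable COE can also be evaluated from a table of conditional probabilities. The Python sketch below is illustrative only; the values of `pi1`, `f2` and `f3` are hypothetical. It computes the averaged direct and indirect effects and compares their sum with $E(Y_3 \mid Y_1 = 1) - E(Y_3 \mid Y_1 = 0)$.

```python
pi1 = 0.35                                   # P(Y1 = 1), hypothetical
f2 = {0: 0.40, 1: 0.65}                      # P(Y2 = 1 | y1), hypothetical
f3 = {(0, 0): 0.10, (0, 1): 0.45,
      (1, 0): 0.30, (1, 1): 0.80}            # P(Y3 = 1 | y1, y2), hypothetical

p_y1 = {0: 1.0 - pi1, 1: pi1}

# Direct effect of Y1 on Y3, averaged over Y2 given Y1 and then over Y1.
avg_direct = sum(
    p_y1[y1] * ((f3[1, 1] - f3[0, 1]) * f2[y1]
                + (f3[1, 0] - f3[0, 0]) * (1.0 - f2[y1]))
    for y1 in (0, 1)
)

# Direct effect of Y2 on Y3 evaluated at Y1 + h = 1 - Y1, averaged over Y1,
# multiplied by the direct effect of Y1 on Y2.
avg_indirect = sum(
    p_y1[y1] * (f3[1 - y1, 1] - f3[1 - y1, 0]) for y1 in (0, 1)
) * (f2[1] - f2[0])

def e_y3(y1):
    # E(Y3 | Y1 = y1), as in Equation 6.7
    return f3[y1, 1] * f2[y1] + f3[y1, 0] * (1.0 - f2[y1])

te31 = e_y3(1) - e_y3(0)
print(avg_direct, avg_indirect, avg_direct + avg_indirect, te31)
```

With these hypothetical inputs the averaged direct effect is 0.273125 and the averaged indirect effect is 0.111875, which sum to the total effect 0.385 up to floating point rounding.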

6.2.2     A  Second  Approach  to  the  Three  Variable  Model 

We  now  give  a  second  approach  to  the  derivation  of  a  COE  for  the  three 
variable  model.  Note  that  in  this  derivation,  as  well  as  the  one  immediately  above, 
a  causal  system  of  variables  is  presented  that  follows  only  two  assumptions:  (1) 
the  errors  have  zero  conditional  expectation;  and  (2)  each  variable  has  a  structural 
relationship  with  only  the  variables  occurring  before  it  in  the  causal  chain.  There 
are  no  assumptions  regarding  the  independence  of  error  terms  in  this  system.  In 
particular,  the  three  variable  dichotomous  system,  or  any  dichotomous  system  in 
general,  has  a  dependent  error  structure. 

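To calculate the quantity $TE_{31}$, we write

\begin{align}
TE_{31} &= \Delta_{y_1} E(Y_3 \mid y_1) \notag \\
&= \Delta_{y_1} E_{\epsilon_2, \epsilon_3}(Y_3 \mid y_1) \notag \\
&= \Delta_{y_1} \sum_{\epsilon_2} \sum_{\epsilon_3} [f_3(y_1, f_2(y_1) + \epsilon_2) + \epsilon_3] \cdot p(\epsilon_2, \epsilon_3 \mid y_1), \notag
\end{align}

where $p(\epsilon_2, \epsilon_3 \mid y_1)$ represents the joint probability mass function of $\epsilon_2$ and $\epsilon_3$.
Because $E(\epsilon_3 \mid y_1, \epsilon_2) = 0$ and $f_3$ is not a function of $\epsilon_3$, we have that

\[ TE_{31} = \Delta_{y_1} \sum_{\epsilon_2} f_3(y_1, f_2(y_1) + \epsilon_2) \cdot p(\epsilon_2 \mid y_1). \]

Note that

\[ \epsilon_2 = \begin{cases} 1 - f_2(y_1) & \text{with probability } f_2(y_1) \\ -f_2(y_1) & \text{with probability } 1 - f_2(y_1). \end{cases} \]

Thus, for any given value of $y_1$, $f_2(y_1) + \epsilon_2$ takes on values $(0, 1)$ that are
functionally independent of $y_1$. Hence,

\begin{align}
TE_{31} &= \Delta_{y_1} \sum_{\epsilon_2} f_3(y_1, f_2(y_1) + \epsilon_2) \cdot p(\epsilon_2 \mid y_1) \notag \\
&= \Delta_{y_1} \{f_3(y_1, 0)\, p(-f_2(y_1)) + f_3(y_1, 1)\, p(1 - f_2(y_1))\} \notag \\
&= \Delta_{y_1} \{f_3(y_1, 0)\,[1 - f_2(y_1)] + f_3(y_1, 1)\, f_2(y_1)\} \notag \\
&= \Delta_{y_1} \{f_3(y_1, 0)\,[1 - f_2(y_1)]\} + \Delta_{y_1} \{f_3(y_1, 1)\, f_2(y_1)\} \notag \\
&= [\Delta_{y_1} f_3(y_1, 0)]\,[1 - f_2(y_1)] + f_3(y_1 + h, 0)\, \Delta_{y_1} [1 - f_2(y_1)] \tag{6.19} \\
&\quad + [\Delta_{y_1} f_3(y_1, 1)]\, f_2(y_1) + f_3(y_1 + h, 1)\, \Delta_{y_1} f_2(y_1) \notag \\
&= \sum_{\epsilon_2} \{[\Delta_{y_1} f_3(y_1, f_2(y_1) + \epsilon_2)]\, p(\epsilon_2 \mid y_1)\} \tag{6.20} \\
&\quad + \sum_{\epsilon_2} \{f_3(y_1 + h, f_2(y_1) + \epsilon_2)\, \Delta_{y_1} p(\epsilon_2 \mid y_1)\} \notag
\end{align}

where the first summation is the sum of the first and third terms in Equation 6.19
and the second is the sum of the second and fourth terms in Equation 6.19.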

Note that each term of Equation 6.19 is a function of $y_1$, the given value of
$Y_1$. Taking the expectation of the right-hand side of Equation 6.19 with respect to
$Y_1$ gives Equation 6.18. Hence, as before, we have the COE for $TE_{31}$ in the three
variable model, given by

\[ TE_{31} = E_{Y_1}\{E_{Y_2 \mid Y_1}[\Delta_{y_1} f_3(y_1, Y_2)]\} + E_{Y_1}[\Delta_{y_2} f_3(Y_1 + h, y_2)]\, \Delta_{y_1} f_2(y_1). \]

Also, note that within each summation over $\epsilon_2$ in Equation 6.20, $\epsilon_2$
is a fixed quantity. Thus, the functions inside the summation, when written as
compound functions of $y_1$, are only functions of $y_1$. Consequently, the $\Delta_{y_1}$ operator
is a full difference quotient and not a partial difference quotient.

Again, from the derivation presented in the current section, we see that the
Calculus of Effects (COE) analog of the COC as applied to systems of dichotomous
variables is achieved. It should be noted once more that this result is achieved
without the assumption of mutual independence among the $\epsilon_k$ terms because the
dichotomous variables system is a special case of a system with dependent $\epsilon_k$
terms. Another point to note is that this derivation shows a specific case where the
COE still holds, even when the second term in the dichotomous analog of Equation
5.33 does not equal zero.

6.3    Models  with  Four  Variables 
To address complications that arise when analyzing causal systems involving
more than three variables, we now present the COE for a four variable model. This
model is identical to the model presented for the three variable case in Section 6.2
and follows the same assumptions, with the addition of a fourth variable in the
causal chain and the related assumptions for the fourth variable. That is, we write
the last variable in the causal chain as $Y_4 = f_4(Y_1, Y_2, Y_3) + \epsilon_4$. Furthermore,
we assume that $Y_4$ given $Y_{A_4}$ is a Bernoulli random variable with mean function
$f_4(Y_{A_4})$ and that $E(\epsilon_4 \mid y_{A_4}) = 0$.

Consider the quantity $TE_{41}$. We write

\begin{align}
E(Y_4 \mid y_1) &= \sum_{\epsilon_2} \sum_{\epsilon_3} \sum_{\epsilon_4} (f_4 + \epsilon_4) \cdot p(\epsilon_2, \epsilon_3, \epsilon_4 \mid y_1) \tag{6.21} \\
&= \sum_{\epsilon_2} \sum_{\epsilon_3} \sum_{\epsilon_4} (f_4 + \epsilon_4) \cdot p(\epsilon_4 \mid y_1, \epsilon_2, \epsilon_3) \cdot p(\epsilon_3, \epsilon_2 \mid y_1). \tag{6.22}
\end{align}

Note that the above probability mass functions are written as functions of
$\epsilon_k$, $k = 2, 3, \ldots, p$, conditional on $y_1$ and all previous $\epsilon_{k'}$, $k' = 2, 3, \ldots, p - 1$.
Conditioning on $y_1$ and the previous $\epsilon_{k'}$ is equivalent to conditioning on all
previous endogenous variables, as was presented in the model assumptions. This
conditioning equivalence is due to the fact that having a given value for any
particular $\epsilon_{k'}$ implies that we also have a specified given value for the associated
$y_{k'}$. As an example consider $\epsilon_2$. If we know that $\epsilon_2 = 1 - f_2(y_1)$, which occurs with
probability $f_2(y_1)$, then this means that $Y_2 = 1$ because $Y_2 = f_2(y_1) + \epsilon_2$. Likewise,
if we know that $\epsilon_2 = -f_2(y_1)$, which occurs with probability $1 - f_2(y_1)$, then we
know that $Y_2 = 0$.

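Hence, since $E(\epsilon_4 \mid y_1, \epsilon_2, \epsilon_3) = 0$ by assumption, we write Equation 6.22 as

\begin{align}
E(Y_4 \mid y_1) &= \sum_{\epsilon_2} \sum_{\epsilon_3} \sum_{\epsilon_4} f_4 \cdot p(\epsilon_4 \mid y_1, \epsilon_2, \epsilon_3) \cdot p(\epsilon_3 \mid y_1, \epsilon_2) \cdot p(\epsilon_2 \mid y_1) \notag \\
&= \sum_{\epsilon_2} \sum_{\epsilon_3} f_4 \cdot p(\epsilon_3 \mid y_1, \epsilon_2) \cdot p(\epsilon_2 \mid y_1). \notag
\end{align}

Now, from the equation immediately above and the fact that $TE_{41}$ is defined to be
$\Delta_{y_1} E(Y_4 \mid y_1)$, we write

\begin{align}
TE_{41} &= \Delta_{y_1} \Big[\sum_{\epsilon_2} \sum_{\epsilon_3} f_4\, p(\epsilon_3 \mid y_1, \epsilon_2)\, p(\epsilon_2 \mid y_1)\Big] \notag \\
&= \sum_{\epsilon_2} \sum_{\epsilon_3} \{\Delta_{y_1} [f_4\, p(\epsilon_3 \mid y_1, \epsilon_2)\, p(\epsilon_2 \mid y_1)]\} \notag
\end{align}

since $\Delta$ is a linear operator.

We use the shorthand notation of $f_3^c$ to denote that $f_3$ is written as a compound
function of $y_1$, that is, $f_3^c = f_3(y_1, f_2(y_1) + \epsilon_2)$. Also, note that $f_2 + \epsilon_2$ and
$f_3 + \epsilon_3$ are not functionally dependent on $y_1$. That is, $Y_2 = f_2 + \epsilon_2$ and $Y_3 = f_3 + \epsilon_3$
are fixed at 0 or 1, regardless of the particular value of $Y_1 = y_1$. Hence, for any
given values of $\epsilon_2$ and $\epsilon_3$, $Y_2$ and $Y_3$ are fixed and not compound functions of $y_1$
and, thus, for specific values within the sum above, partial difference quotients are
equivalent to full difference quotients. Therefore, using the product rule inside the
summation, we have

\begin{align}
TE_{41} &= \sum_{\epsilon_2} \sum_{\epsilon_3} \{[\Delta_{y_1} f_4]\, p(\epsilon_3 \mid y_1, \epsilon_2)\, p(\epsilon_2 \mid y_1) \tag{6.23} \\
&\quad + f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, f_3^c(y_1) + \epsilon_3)\, \Delta_{y_1} [p(\epsilon_3 \mid y_1, \epsilon_2)\, p(\epsilon_2 \mid y_1)]\}. \notag
\end{align}

Applying the product rule once more to the $\Delta$ notation in the second line of
Equation 6.23 above, we write

\begin{align}
TE_{41} &= \sum_{\epsilon_2} \sum_{\epsilon_3} \{[\Delta_{y_1} f_4]\, p(\epsilon_3 \mid y_1, \epsilon_2)\, p(\epsilon_2 \mid y_1) \tag{6.24} \\
&\quad + f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, f_3^c + \epsilon_3)\,[p(\epsilon_2 \mid y_1)\, \Delta_{y_1} p(\epsilon_3 \mid y_1, \epsilon_2) \notag \\
&\quad + p(\epsilon_3 \mid y_1 + h_1, \epsilon_2)\, \Delta_{y_1} p(\epsilon_2 \mid y_1)]\}. \notag
\end{align}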

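We distribute the summations and obtain

\begin{align}
TE_{41} &= \sum_{\epsilon_2} \sum_{\epsilon_3} [\Delta_{y_1} f_4]\, p(\epsilon_3 \mid y_1, \epsilon_2)\, p(\epsilon_2 \mid y_1) \tag{6.25} \\
&\quad + \sum_{\epsilon_2} \sum_{\epsilon_3} f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, f_3^c + \epsilon_3)\, p(\epsilon_2 \mid y_1)\, \Delta_{y_1} p(\epsilon_3 \mid y_1, \epsilon_2) \notag \\
&\quad + \sum_{\epsilon_2} \sum_{\epsilon_3} f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, f_3^c + \epsilon_3)\, p(\epsilon_3 \mid y_1 + h_1, \epsilon_2)\, \Delta_{y_1} p(\epsilon_2 \mid y_1). \notag
\end{align}

Note that, using the first line of Equation 6.25 above, we can write

\[ \sum_{\epsilon_2} \sum_{\epsilon_3} [\Delta_{y_1} f_4]\, p(\epsilon_3 \mid y_1, \epsilon_2)\, p(\epsilon_2 \mid y_1) = E_{\epsilon_2, \epsilon_3 \mid Y_1}(\Delta_{y_1} f_4). \tag{6.26} \]

Next, we consider the quantity in the second term on the right-hand side of
Equation 6.25 above and write

\begin{align}
&\sum_{\epsilon_2} \sum_{\epsilon_3} f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, f_3^c + \epsilon_3) \cdot p(\epsilon_2 \mid y_1) \cdot \Delta_{y_1} p(\epsilon_3 \mid y_1, \epsilon_2) \notag \\
&= \sum_{\epsilon_2} p(\epsilon_2 \mid y_1)\,[f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, 1) \cdot \Delta_{y_1} p(1 - f_3^c \mid y_1, \epsilon_2)] \tag{6.27} \\
&\quad + \sum_{\epsilon_2} p(\epsilon_2 \mid y_1)\,[f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, 0) \cdot \Delta_{y_1} p(-f_3^c \mid y_1, \epsilon_2)] \notag
\end{align}

where $p(1 - f_3^c \mid y_1, \epsilon_2) = f_3(y_1, f_2(y_1) + \epsilon_2)$ and $p(-f_3^c \mid y_1, \epsilon_2) = 1 - f_3(y_1, f_2(y_1) + \epsilon_2)$.
Hence, we have that

\[ \Delta_{y_1} p(1 - f_3^c \mid y_1, \epsilon_2) = \Delta_{y_1} f_3(y_1, f_2(y_1) + \epsilon_2) \]

and

\[ \Delta_{y_1} p(-f_3^c \mid y_1, \epsilon_2) = -\Delta_{y_1} f_3(y_1, f_2(y_1) + \epsilon_2). \]

We can now write Equation 6.27 above as

\begin{align}
&\sum_{\epsilon_2} p(\epsilon_2 \mid y_1)\, f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, 1)\, \Delta_{y_1} f_3^c \notag \\
&\quad - \sum_{\epsilon_2} p(\epsilon_2 \mid y_1)\, f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, 0)\, \Delta_{y_1} f_3^c. \notag
\end{align}

Noting that

\begin{align}
&f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, 1) - f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, 0) \notag \\
&= \Delta_{y_3} f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, y_3), \notag
\end{align}

we write the second line of Equation 6.25 as

\[ \sum_{\epsilon_2} \Delta_{y_3} f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, y_3)\, \Delta_{y_1} f_3^c\, p(\epsilon_2 \mid y_1). \]

Thus, we have shown that the second line of Equation 6.25 can be written as

\[ E_{\epsilon_2 \mid Y_1}[\Delta_{y_3} f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, y_3) \cdot \Delta_{y_1} f_3^c]. \tag{6.28} \]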

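Lastly, examining the third term of Equation 6.25, we write

\begin{align}
&\sum_{\epsilon_2} \sum_{\epsilon_3} f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, f_3^c + \epsilon_3)\, p(\epsilon_3 \mid y_1 + h_1, \epsilon_2) \cdot \Delta_{y_1} p(\epsilon_2 \mid y_1) \tag{6.29} \\
&= \sum_{\epsilon_2} \Big[\Delta_{y_1} p(\epsilon_2 \mid y_1) \sum_{\epsilon_3} f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, f_3^c + \epsilon_3) \cdot p(\epsilon_3 \mid y_1 + h_1, \epsilon_2)\Big]. \tag{6.30}
\end{align}

Hence, using expectation notation, we write the third term of Equation 6.25 as

\begin{align}
&\sum_{\epsilon_2} [\Delta_{y_1} p(\epsilon_2 \mid y_1)]\, E_{\epsilon_3 \mid Y_1, \epsilon_2}[f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, f_3^c + \epsilon_3) \mid Y_1 = y_1 + h_1] \notag \\
&= \sum_{\epsilon_2} [\Delta_{y_1} p(\epsilon_2 \mid y_1)] \tag{6.31} \\
&\qquad \cdot E_{\epsilon_3 \mid Y_1, \epsilon_2}[E_{Y_4 \mid Y_1, \epsilon_2, \epsilon_3}(Y_4 \mid Y_1 = y_1 + h_1, Y_2 = f_2(y_1) + \epsilon_2, Y_3 = f_3^c + \epsilon_3)] \notag \\
&= \sum_{\epsilon_2} [\Delta_{y_1} p(\epsilon_2 \mid y_1)]\, E_{Y_4 \mid Y_1, \epsilon_2}(Y_4 \mid y_1 + h_1, f_2(y_1) + \epsilon_2) \tag{6.32} \\
&= \Delta_{y_1} f_2(y_1)\, \Delta_{y_2} E_{Y_4 \mid Y_1, \epsilon_2}(Y_4 \mid y_1 + h_1, y_2 - f_2(y_1)). \tag{6.33}
\end{align}

Next, note that the quantity $\Delta_{y_2} E_{Y_4 \mid Y_1, \epsilon_2}(Y_4 \mid y_1 + h_1, y_2 - f_2(y_1))$ in Equation
6.33 immediately above can be written as

\begin{align}
&\Delta_{y_2} E_{Y_4 \mid Y_1, \epsilon_2}(Y_4 \mid y_1 + h_1, y_2 - f_2(y_1)) \notag \\
&= \Delta_{y_2} \sum_{\epsilon_3} [f_4(y_1 + h_1, y_2, f_3^c + \epsilon_3)\, p(\epsilon_3 \mid y_1 + h_1, \epsilon_2)] \notag \\
&= \sum_{\epsilon_3} \Delta_{y_2} [f_4(y_1 + h_1, y_2, f_3^c + \epsilon_3)\, p(\epsilon_3 \mid y_1 + h_1, \epsilon_2)] \notag \\
&= \sum_{\epsilon_3} [\Delta_{y_2} f_4(y_1 + h_1, y_2, f_3^c + \epsilon_3)]\, p(\epsilon_3 \mid y_1 + h_1, y_2 - f_2(y_1)) \notag \\
&\quad + \sum_{\epsilon_3} f_4(y_1 + h_1, y_2 + h_2, f_3^c + \epsilon_3)\, \Delta_{y_2} p(\epsilon_3 \mid y_1 + h_1, y_2 - f_2(y_1)) \notag \\
&= E_{\epsilon_3 \mid Y_1, \epsilon_2}[\Delta_{y_2} f_4(y_1 + h_1, y_2, f_3^c + \epsilon_3) \mid Y_1 = y_1 + h_1] \notag \\
&\quad + \Delta_{y_3} f_4(y_1 + h_1, y_2 + h_2, y_3)\, \Delta_{y_2} f_3(y_1 + h_1, y_2). \notag
\end{align}

Therefore, the last term in Equation 6.25 can be fully expanded and written as

\begin{align}
&\Delta_{y_2} E_{Y_4 \mid Y_1, \epsilon_2}(Y_4 \mid y_1 + h_1, y_2)\, \Delta_{y_1} f_2(y_1) \tag{6.34} \\
&= E_{\epsilon_3 \mid Y_1, \epsilon_2}[\Delta_{y_2} f_4(y_1 + h_1, y_2, f_3^c + \epsilon_3)]\, \Delta_{y_1} f_2(y_1) \notag \\
&\quad + \Delta_{y_3} f_4(y_1 + h_1, y_2 + h_2, y_3) \cdot \Delta_{y_2} f_3(y_1 + h_1, y_2)\, \Delta_{y_1} f_2(y_1). \notag
\end{align}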


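Now, using the expansions given above and combining Equations 6.26, 6.28
and 6.34 to fully expand Equation 6.25, we write

\begin{align}
TE_{41} &= \Delta_{y_1} E(Y_4 \mid y_1) \notag \\
&= E_{\epsilon_2, \epsilon_3 \mid Y_1}[\Delta_{y_1} f_4(y_1, f_2(y_1) + \epsilon_2, f_3^c + \epsilon_3)] \notag \\
&\quad + E_{\epsilon_2 \mid Y_1}[\Delta_{y_3} f_4(y_1 + h_1, f_2(y_1) + \epsilon_2, y_3)\, \Delta_{y_1} f_3(y_1, f_2(y_1) + \epsilon_2)] \notag \\
&\quad + E_{\epsilon_3 \mid Y_1, \epsilon_2}[\Delta_{y_2} f_4(y_1 + h_1, y_2, f_3^c + \epsilon_3)]\, \Delta_{y_1} f_2(y_1) \notag \\
&\quad + \Delta_{y_3} f_4(y_1 + h_1, y_2 + h_2, y_3)\, \Delta_{y_2} f_3(y_1 + h_1, y_2)\, \Delta_{y_1} f_2(y_1). \notag
\end{align}

This formulation represents the COE for the four variable dichotomous path model.
Note that the quantity $TE_{41}$ above can also be expressed as

\begin{align}
TE_{41} &= DE_{41}(y_1) + DE_{43}\, DE_{31}(y_1) + DE_{42}(y_1, y_2)\, DE_{21} \notag \\
&\quad + DE_{43}(y_1, y_2)\, DE_{32}(y_1)\, DE_{21} \notag \\
&= DE_{41}(y_1) + IE_{431}(y_1) + IE_{421}(y_1, y_2) + IE_{4321}(y_1, y_2) \tag{6.35}
\end{align}

where each $DE_{jj'}$, $j > j'$, represents an average direct effect, averaged over all
variables intermediate to variables $Y_j$ and $Y_{j'}$. Thus, we have derived a method to
estimate and represent the direct and indirect effects in a four variable dichotomous
path model that is analogous to the representation that only previously applied
to path models with continuous and linearly related variables. Note that the
formulation and representation of the COE given above in Equation 6.35 is directly
analogous to the COC formulation for continuous and linear path models shown in
Equation 4.35 in Section 4.1.5 where the quantity $TE_{41}$ was expressed as

\[ TE_{41} = \frac{\partial f_4}{\partial y_1} + \frac{\partial f_4}{\partial y_3} \frac{\partial f_3}{\partial y_1} + \frac{\partial f_4}{\partial f_2} \frac{\partial f_2}{\partial y_1} + \frac{\partial f_4}{\partial y_3} \frac{\partial f_3}{\partial f_2} \frac{\partial f_2}{\partial y_1}. \tag{6.36} \]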

The  TE  in  Equation  6.36  can  also  be  specified  as 
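\begin{align}
TE_{41} &= DE_{41} + DE_{43} DE_{31} + DE_{42} DE_{21} + DE_{43} DE_{32} DE_{21} \tag{6.37} \\
&= \beta_{41} + \beta_{43}\beta_{31} + \beta_{42}\beta_{21} + \beta_{43}\beta_{32}\beta_{21} \tag{6.38}
\end{align}

or, equivalently written as

\[
TE_{41} = \overline{DE}_{41} + \overline{DE}_{43}\,\overline{DE}_{31} + \overline{DE}_{42}\,\overline{DE}_{21} + \overline{DE}_{43}\,\overline{DE}_{32}\,\overline{DE}_{21}.
\tag{6.39}
\]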

The  equivalence  between  Equation  6.36  through  Equation  6.39  holds  because,  for 
example,  in  the  case  of  continuous,  linear  path  models 


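\begin{align*}
\overline{DE}_{41} &= \frac{\partial}{\partial y_1} E_{\epsilon_2, \epsilon_3}\left[f_4(y_1, Y_2, Y_3)\right] \\
&= E_{\epsilon_2, \epsilon_3}\left[\frac{\partial f_4(y_1, y_2, y_3)}{\partial y_1}\right] \\
&= \beta_{41} \\
&= DE_{41}.
\end{align*}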


Likewise, in the four variable nonlinear case considered earlier in Section 5.3.1, the quantity $\overline{DE}_{41}$, as seen in Equation 5.20, is expressed as

\[
\overline{DE}_{41} = E_{\epsilon_2, \epsilon_3}\left(\frac{\partial f_4}{\partial y_1}\right).
\]

Thus,  we  see  that  the  COE  derived  in  this  section  can  be  expressed  in  similar  form 
as  the  COC  for  classical  path  models  with  mutually  independent  errors,  as  well 
as  the  COE  for  nonlinear  path  models,  making  these  three  formulations  directly 
analogous. 
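
The four-term expansion of $TE_{41}$ derived in this section can be checked numerically. The following is a minimal sketch, not part of the original derivation: the conditional mean functions `f2`, `f3` and `f4` and their coefficients are hypothetical values chosen only so that every mean lies in [0, 1], and the expansion is evaluated at $y_1 = 0$, $y_2 = 0$ with $h_1 = h_2 = h_3 = 1$.

```python
from itertools import product

# Hypothetical conditional means (illustrative values only, not estimates).
def f2(y1):
    return 0.3 + 0.4 * y1

def f3(y1, y2):
    return 0.2 + 0.3 * y1 + 0.25 * y2 + 0.1 * y1 * y2

def f4(y1, y2, y3):
    return 0.1 + 0.2 * y1 + 0.15 * y2 + 0.3 * y3 + 0.1 * y2 * y3

def bern(p, y):
    """Bernoulli mass function: P(Y = y) when P(Y = 1) = p."""
    return p if y == 1 else 1.0 - p

def mean_y4(y1):
    """E(Y4 | Y1 = y1), obtained by summing over Y2 and Y3."""
    return sum(
        f4(y1, b, c) * bern(f3(y1, b), c) * bern(f2(y1), b)
        for b, c in product((0, 1), repeat=2)
    )

def coe_terms():
    """Direct effect and three indirect effects of Y1 on Y4 at y1 = 0, y2 = 0."""
    y1, y2 = 0, 0
    # Direct effect, averaged over (Y2, Y3) given Y1 = y1.
    de41 = sum(
        (f4(1, b, c) - f4(0, b, c)) * bern(f3(y1, b), c) * bern(f2(y1), b)
        for b, c in product((0, 1), repeat=2)
    )
    # Y1 -> Y3 -> Y4, averaged over Y2 given Y1 = y1.
    ie431 = sum(
        (f4(1, b, 1) - f4(1, b, 0)) * (f3(1, b) - f3(0, b)) * bern(f2(y1), b)
        for b in (0, 1)
    )
    # Y1 -> Y2 -> Y4, averaged over Y3 given Y1 = y1 + h1 and Y2 = y2.
    ie421 = sum(
        (f4(1, 1, c) - f4(1, 0, c)) * bern(f3(1, y2), c) for c in (0, 1)
    ) * (f2(1) - f2(0))
    # Y1 -> Y2 -> Y3 -> Y4: a plain product of three differences.
    ie4321 = (f4(1, 1, 1) - f4(1, 1, 0)) * (f3(1, 1) - f3(1, 0)) * (f2(1) - f2(0))
    return de41, ie431, ie421, ie4321

total_effect = mean_y4(1) - mean_y4(0)
terms = coe_terms()
```

Under these assumed means, `sum(terms)` reproduces `total_effect` up to floating-point error, which is the partitioning that the COE asserts.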
6.4    The General Model with p > 3 Variables

We now generalize the above results to the case of p > 3 dichotomous endogenous variables in the causal chain. More specifically, we consider the case of formulating and estimating the conditional TE of variable $Y_k$ on variable $Y_l$, for any $k$ and $l$ such that $1 \le k < l \le p$. That is, we are interested in the quantity $TE_{lk|A_k}$. We write the p-variable model as

\begin{align}
Y_1 &= \pi_1 + \epsilon_1 \tag{6.40} \\
Y_2 &= f_2(Y_1) + \epsilon_2 \tag{6.41} \\
&\;\;\vdots \notag \\
Y_p &= f_p(Y_1, Y_2, \ldots, Y_{p-1}) + \epsilon_p. \tag{6.42}
\end{align}

Note  that  an  implied  assumption  of  this  model  is  that  the  variables  are  strictly 
ordered  in  the  sense  that  each  variable  is  a  function  only  of  previous  variables 
in  the  causal  chain.  To  complete  the  specification  of  this  model,  we  make  the 
following  additional  assumptions: 

1.  $Y_1$ is distributed as a Bernoulli random variable with mean $\pi_1$.

2.  $Y_k$, given $\mathbf{Y}_{A_k}$, is distributed as a Bernoulli random variable with mean $f_k(\mathbf{Y}_{A_k})$, $k = 2, 3, \ldots, p$.

3.  $E(\epsilon_1) = E(\epsilon_2 \mid y_1) = \cdots = E(\epsilon_p \mid \mathbf{y}_{A_p}) = 0$.

Without loss of generality, we consider the case where $l = p$, that is, variable $Y_l$ is the last variable in the causal chain, realizing that any variables occurring after the variable of interest, $Y_l$, can be averaged out as was shown in Section 5.5.2. We now prove the existence of a COE for $TE_{lk|A_k}$, that is, the total effect of $Y_k$ on $Y_l$ conditional on $\mathbf{Y}_{A_k}$ and $Y_k$, given a system of dichotomous endogenous variables as defined in the model above. The following theorem is analogous to Theorem 5.5.1, which gives a COE for nonlinear models with continuous endogenous variables.
Theorem 6.4.1. (Calculus of Effects) Given the p-variable path model defined by Equations 6.40-6.42, the conditional total effect $TE_{lk|A_k}$, for $l$ and $k$ arbitrarily chosen such that $1 \le k < l \le p$, is the sum of the expectation of direct and indirect effects with respect to the density of $\mathbf{Y}_I$ given $\mathbf{Y}_{A_k}$ and $Y_k$. That is,

\begin{align*}
TE_{lk|A_k} &= E[DE_{lk}] + \sum_{Q \in 2^{B_{lk}} - \emptyset} E[IE_{l(Q)k}] \\
&= E\left[\Delta_{y_k} f_l(\mathbf{y}_{A_k}, y_k, \mathbf{Y}_I)\right]
+ \sum_{Q \in 2^{B_{lk}} - \emptyset} E\Big[\prod_{(k', l') \in l(Q)k} \Delta_{y_{k'}} f_{l'}(\mathbf{y}^*_{A_{l'}})\Big]
\end{align*}

where the expectation in the first term is taken with respect to the $\epsilon$ terms that are intermediate to $\epsilon_l$ and $\epsilon_k$, that is, $\boldsymbol{\epsilon}_I$, the expectation in the second term is taken with respect to the set of all $\epsilon_i$'s that are intermediate to $\epsilon_l$ and $\epsilon_k$ with subscripts not in $Q$, and $\mathbf{y}^*_{A_{l'}}$ is as defined in Definition 6.1.3.

Proof.  We  write 


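\begin{align*}
TE_{lk|A_k} &= \Delta_{y_k} E(Y_l \mid \mathbf{y}_{A_k}, y_k) \\
&= \Delta_{y_k} E(Y_l \mid \mathbf{Y}_{A_k} = \mathbf{y}_{A_k}, Y_k = y_k) \\
&= \Delta_{y_k} E_{\mathbf{Y}_I \mid \mathbf{Y}_{A_k}, Y_k}\left\{E_{Y_l \mid \mathbf{Y}_{A_k}, Y_k, \mathbf{Y}_I}\left[f_l(\mathbf{y}_{A_k}, y_k, \mathbf{Y}_I) + \epsilon_l\right]\right\}
\end{align*}

by the laws of conditional expectations. Also, since $E(\epsilon_l \mid \mathbf{y}_{A_k}, y_k, \mathbf{y}_I) = 0$, the line immediately above reduces to

\[
TE_{lk|A_k} = \Delta_{y_k} E_{\mathbf{Y}_I \mid \mathbf{Y}_{A_k}, Y_k}\left[f_l(\mathbf{y}_{A_k}, y_k, \mathbf{Y}_I)\right].
\]

Rewriting this equation and taking expectations with respect to $\boldsymbol{\epsilon}_I$, we have

\[
TE_{lk|A_k} = \Delta_{y_k} \sum_{\boldsymbol{\epsilon}_I} f_l(\mathbf{y}_{A_k}, y_k, \mathbf{f}_I + \boldsymbol{\epsilon}_I)\, p(\boldsymbol{\epsilon}_I \mid \mathbf{y}_{A_k}, y_k)
\]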

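where $p(\boldsymbol{\epsilon}_I \mid \mathbf{y}_{A_k}, y_k)$ represents the joint probability mass function for the $\boldsymbol{\epsilon}_I$, that is, all error terms intermediate to $Y_k$ and $Y_l$ in the causal chain. Therefore, since $\Delta$ is a linear operator, we can switch the order of operations and write

\begin{align}
TE_{lk|A_k} &= \sum_{\boldsymbol{\epsilon}_I} \Delta_{y_k}\left[f_l(\mathbf{y}_{A_k}, y_k, \mathbf{f}_I + \boldsymbol{\epsilon}_I)\, p(\boldsymbol{\epsilon}_I \mid \mathbf{y}_{A_k}, y_k)\right] \tag{6.43} \\
&= \sum_{\boldsymbol{\epsilon}_I} \left[\Delta_{y_k} f_l(\mathbf{y}_{A_k}, y_k, \mathbf{f}_I + \boldsymbol{\epsilon}_I)\right] p(\boldsymbol{\epsilon}_I \mid \mathbf{y}_{A_k}, y_k) \tag{6.44} \\
&\quad + \sum_{\boldsymbol{\epsilon}_I} f_l(\mathbf{y}_{A_k}, y_k + h_k, \mathbf{f}_I + \boldsymbol{\epsilon}_I) \left[\Delta_{y_k} p(\boldsymbol{\epsilon}_I \mid \mathbf{y}_{A_k}, y_k)\right], \notag
\end{align}

via the product rule of the $\Delta$ operator.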
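
The product rule invoked here follows directly from the definition of the difference operator. The short derivation below is a sketch under stated assumptions: $u$ and $v$ are generic functions of $y_k$ introduced only for illustration, and $\Delta_{y_k} u(y_k)$ is written as the forward difference $u(y_k + h_k) - u(y_k)$ (dividing through by $h_k$ gives the same identity for difference quotients).

```latex
\begin{align*}
\Delta_{y_k}\left[u(y_k)\,v(y_k)\right]
  &= u(y_k + h_k)\,v(y_k + h_k) - u(y_k)\,v(y_k) \\
  &= \left[u(y_k + h_k) - u(y_k)\right] v(y_k)
     + u(y_k + h_k)\left[v(y_k + h_k) - v(y_k)\right] \\
  &= \left[\Delta_{y_k} u(y_k)\right] v(y_k)
     + u(y_k + h_k)\,\Delta_{y_k} v(y_k).
\end{align*}
```

Taking $u$ to be the mean function of the last variable and $v$ to be the joint mass function of the intermediate error terms yields the two sums in Equation 6.44, including the shift to $y_k + h_k$ in the second sum.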

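Note that the first line of Equation 6.44 above can also be written as

\[
E_{\boldsymbol{\epsilon}_I \mid \mathbf{Y}_{A_k}, Y_k}\left[\Delta_{y_k} f_l(\mathbf{y}_{A_k}, y_k, \mathbf{f}_I + \boldsymbol{\epsilon}_I)\right].
\]

Now, noting that the quantity $p(\boldsymbol{\epsilon}_I \mid \mathbf{y}_{A_k}, y_k)$ in the second line of Equation 6.44 represents the joint probability mass function of $\epsilon_{k+1}, \epsilon_{k+2}, \ldots, \epsilon_{l-1}$, given $\mathbf{Y}_{A_k} = \mathbf{y}_{A_k}$ and $Y_k = y_k$, and since conditioning on a specified $\epsilon$ term is equivalent to conditioning on the associated $Y$ random variable, this quantity can be written as

\[
p(\boldsymbol{\epsilon}_I \mid \mathbf{y}_{A_k}, y_k) = p(\epsilon_{l-1} \mid \mathbf{y}_{A_k}, y_k, y_{k+1}, \ldots, y_{l-2})\, p(\epsilon_{l-2}, \epsilon_{l-3}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k).
\]

Thus, using the $\Delta$ operator, we have

\begin{align*}
\Delta_{y_k} p(\boldsymbol{\epsilon}_I \mid \mathbf{y}_{A_k}, y_k)
&= \left[\Delta_{y_k} p(\epsilon_{l-1} \mid \mathbf{y}_{A_k}, y_k, y_{k+1}, \ldots, y_{l-2})\right] p(\epsilon_{l-2}, \epsilon_{l-3}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k) \\
&\quad + p(\epsilon_{l-1} \mid \mathbf{y}_{A_k}, y_k + h_k, y_{k+1}, \ldots, y_{l-2})\, \Delta_{y_k} p(\epsilon_{l-2}, \epsilon_{l-3}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k).
\end{align*}

Similarly, the quantity $p(\epsilon_{l-2}, \epsilon_{l-3}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k)$ can also be written as

\[
p(\epsilon_{l-2}, \epsilon_{l-3}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k) = p(\epsilon_{l-2} \mid \mathbf{y}_{A_k}, y_k, y_{k+1}, \ldots, y_{l-3})\, p(\epsilon_{l-3}, \epsilon_{l-4}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k).
\]

Then, again using the $\Delta$ operator, the quantity $\Delta_{y_k} p(\epsilon_{l-2}, \epsilon_{l-3}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k)$ can be further expanded. By repeatedly performing this expansion one variable at a time, we can write the second term in Equation 6.44 above as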
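\begin{align}
&\sum_{\boldsymbol{\epsilon}_I} f_l(\mathbf{y}_{A_k}, y_k + h_k, \mathbf{f}_I + \boldsymbol{\epsilon}_I)\, \Delta_{y_k} p(\boldsymbol{\epsilon}_I \mid \mathbf{y}_{A_k}, y_k) \tag{6.45} \\
&= \sum_{\boldsymbol{\epsilon}_I} f_l(\mathbf{y}_{A_k}, y_k + h_k, \mathbf{f}_I + \boldsymbol{\epsilon}_I)
\left[\Delta_{y_k} p(\epsilon_{l-1} \mid \mathbf{y}_{A_k}, y_k, y_{k+1}, \ldots, y_{l-2})\right] p(\epsilon_{l-2}, \epsilon_{l-3}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k) \tag{6.46} \\
&\quad + \sum_{\boldsymbol{\epsilon}_I} \big\{ f_l(\mathbf{y}_{A_k}, y_k + h_k, \mathbf{f}_I + \boldsymbol{\epsilon}_I)\, p(\epsilon_{l-1} \mid \mathbf{y}_{A_k}, y_k + h_k, y_{k+1}, \ldots, y_{l-2}) \notag \\
&\qquad\quad \cdot \left[\Delta_{y_k} p(\epsilon_{l-2} \mid \mathbf{y}_{A_k}, y_k, y_{k+1}, \ldots, y_{l-3})\right] p(\epsilon_{l-3}, \epsilon_{l-4}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k) \big\} \notag \\
&\quad + \sum_{\boldsymbol{\epsilon}_I} \big\{ f_l(\mathbf{y}_{A_k}, y_k + h_k, \mathbf{f}_I + \boldsymbol{\epsilon}_I)\, p(\epsilon_{l-1} \mid \mathbf{y}_{A_k}, y_k + h_k, y_{k+1}, \ldots, y_{l-2}) \notag \\
&\qquad\quad \cdot p(\epsilon_{l-2} \mid \mathbf{y}_{A_k}, y_k + h_k, y_{k+1}, \ldots, y_{l-3}) \left[\Delta_{y_k} p(\epsilon_{l-3} \mid \mathbf{y}_{A_k}, y_k, y_{k+1}, \ldots, y_{l-4})\right] \notag \\
&\qquad\quad \cdot p(\epsilon_{l-4}, \epsilon_{l-5}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k) \big\} + \cdots \notag \\
&\quad + \sum_{\boldsymbol{\epsilon}_I} f_l(\mathbf{y}_{A_k}, y_k + h_k, \mathbf{f}_I + \boldsymbol{\epsilon}_I)\, p(\epsilon_{l-1} \mid \mathbf{y}_{A_k}, y_k + h_k, y_{k+1}, \ldots, y_{l-2}) \notag \\
&\qquad\quad \cdot p(\epsilon_{l-2} \mid \mathbf{y}_{A_k}, y_k + h_k, y_{k+1}, \ldots, y_{l-3}) \cdots p(\epsilon_{k+2} \mid \mathbf{y}_{A_k}, y_k + h_k, y_{k+1})\, \Delta_{y_k} p(\epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k). \notag
\end{align}

Therefore, considering the first term in the expansion of

\[
\sum_{\boldsymbol{\epsilon}_I} f_l(\mathbf{y}_{A_k}, y_k + h_k, \mathbf{f}_I + \boldsymbol{\epsilon}_I)\, \Delta_{y_k} p(\boldsymbol{\epsilon}_I \mid \mathbf{y}_{A_k}, y_k)
\]

in Equation 6.45 above, we can rewrite the first term of Equation 6.46 as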
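\begin{align}
&\sum_{\boldsymbol{\epsilon}_I} f_l(\mathbf{y}_{A_k}, y_k + h_k, \mathbf{f}_I + \boldsymbol{\epsilon}_I) \left[\Delta_{y_k} p(\epsilon_{l-1} \mid \mathbf{y}_{A_k}, y_k, y_{k+1}, \ldots, y_{l-2})\right]
p(\epsilon_{l-2}, \epsilon_{l-3}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k) \tag{6.47} \\
&= \sum_{\epsilon_{l-2}, \ldots, \epsilon_{k+1}} p(\epsilon_{l-2}, \epsilon_{l-3}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k) \tag{6.48} \\
&\qquad \cdot \sum_{\epsilon_{l-1}} \left[f_l(\mathbf{y}_{A_k}, y_k + h_k, \mathbf{f}_I + \boldsymbol{\epsilon}_I)\, \Delta_{y_k} p(\epsilon_{l-1} \mid \mathbf{y}_{A_k}, y_k, y_{k+1}, \ldots, y_{l-2})\right] \notag \\
&= \sum_{\epsilon_{l-2}, \ldots, \epsilon_{k+1}} p(\epsilon_{l-2}, \epsilon_{l-3}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k) \tag{6.49} \\
&\qquad \cdot \big[ f_l(\mathbf{y}_{A_k}, y_k + h_k, y_{k+1}, \ldots, \epsilon_{l-1} = 1 - f_{l-1}) \notag \\
&\qquad\qquad \cdot \Delta_{y_k} p(\epsilon_{l-1} = 1 - f_{l-1} \mid \mathbf{y}_{A_k}, y_k, y_{k+1}, \ldots, y_{l-2}) \notag \\
&\qquad\quad + f_l(\mathbf{y}_{A_k}, y_k + h_k, y_{k+1}, \ldots, \epsilon_{l-1} = -f_{l-1}) \notag \\
&\qquad\qquad \cdot \Delta_{y_k} p(\epsilon_{l-1} = -f_{l-1} \mid \mathbf{y}_{A_k}, y_k, y_{k+1}, \ldots, y_{l-2}) \big] \notag \\
&= \sum_{\epsilon_{l-2}, \ldots, \epsilon_{k+1}} p(\epsilon_{l-2}, \epsilon_{l-3}, \ldots, \epsilon_{k+1} \mid \mathbf{y}_{A_k}, y_k) \tag{6.50} \\
&\qquad \cdot \Delta_{y_{l-1}} f_l(\mathbf{y}_{A_k}, y_k + h_k, y_{k+1}, \ldots, y_{l-1})
\cdot \Delta_{y_k} f_{l-1}(\mathbf{y}_{A_k}, y_k, y_{k+1}, \ldots, y_{l-2}) \notag \\
&= E_{\epsilon_{l-2}, \epsilon_{l-3}, \ldots, \epsilon_{k+1}}\big[\Delta_{y_{l-1}} f_l(\mathbf{y}_{A_k}, y_k + h_k, y_{k+1}, \ldots, y_{l-1}) \tag{6.51} \\
&\qquad\qquad \cdot \Delta_{y_k} f_{l-1}(\mathbf{y}_{A_k}, y_k, y_{k+1}, \ldots, y_{l-2})\big] \notag \\
&= E_{\epsilon_{l-2}, \epsilon_{l-3}, \ldots, \epsilon_{k+1}}\left[IE_{l(l-1)k}\right]. \tag{6.52}
\end{align}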

Each of the remaining terms in the expansion of Equation 6.45 can also be expanded using the methodology presented both above and in Sections 6.2 and 6.3, resulting in a conditional COE for the General Dichotomous Model, conditional on the antecedent variables, $\mathbf{Y}_{A_k} = \mathbf{y}_{A_k}$ and $Y_k = y_k$. The full expansion COE is

\[
TE_{lk|A_k} = \sum_{Q \in 2^{B_{lk}}} E\Big[\prod_{(k', l') \in l(Q)k} \Delta_{y_{k'}} f_{l'}(\mathbf{y}^*_{A_{l'}})\Big].
\]

$\Box$
Noting, as occurred in the continuous case considered in Section 5.5, that the above formulation would yield a different set of effects for each unique value of $\mathbf{Y}_{A_k}$, we recommend averaging over these variables, which yields a COE for path models with dichotomous variables written as

\[
E_{\mathbf{Y}_{A_k}}(TE_{lk|A_k}) = E_{\mathbf{Y}_{A_k}}\Big\{\sum_{Q \in 2^{B_{lk}}} E\Big[\prod_{(k', l') \in l(Q)k} \Delta_{y_{k'}} f_{l'}(\mathbf{y}^*_{A_{l'}})\Big]\Big\}
\]

and represents the average (over $\mathbf{Y}_{A_k}$) of the TE of $Y_k$ on $Y_l$ for each value of $\mathbf{Y}_{A_k} = \mathbf{y}_{A_k}$. Note again that, for predictive purposes or to study subject-specific effects, the COE formulation, before averaging over $\mathbf{Y}_{A_k}$, must be viewed as a COE conditional on each specific value of $\mathbf{Y}_{A_k}$, and interpreted as such.

Note also that, as with models containing continuous endogenous variables, there will again be ${}_mC_0 = 1$ "zero-th" order IE term (the direct effect term). There will be ${}_mC_1 = m$ first order IE terms, as well as ${}_mC_2$ second order IE terms, etcetera, until there is only ${}_mC_m = 1$ $m$th order IE.
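
The counting argument above can be illustrated by enumerating the power set of the intermediate variables. The following is a minimal sketch: the helper name `indirect_effect_paths` is hypothetical, and it assumes that $m$ is the number of variables strictly between $Y_k$ and $Y_l$ in the causal chain, so that each subset $Q$ of those variables indexes one term of the COE.

```python
from itertools import combinations

def indirect_effect_paths(k, l):
    """Group the causal paths from Y_k to Y_l by the order of the effect.

    Order 0 is the direct effect; order j passes through j of the
    m = l - k - 1 intermediate variables.
    """
    intermediates = list(range(k + 1, l))
    paths_by_order = {}
    for order in range(len(intermediates) + 1):
        paths_by_order[order] = [
            (k, *subset, l) for subset in combinations(intermediates, order)
        ]
    return paths_by_order

# Four variable chain: effects of Y_1 on Y_4, with m = 2 intermediates.
paths = indirect_effect_paths(1, 4)
counts = {order: len(p) for order, p in paths.items()}
```

For the four variable chain this gives one direct path, two first order paths (through $Y_2$ or through $Y_3$) and one second order path, matching the four terms of Equation 6.35.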

Both Theorem 6.4.1 and Definition 6.1.3 above require notational intricacies that must be specified and discussed. First, it should be noted that, in Theorem 6.4.1, the expectation is taken with respect to all variables that are intermediate to $Y_l$ and $Y_k$ in the causal chain. However, the final formulation of the TE results in expectations of IEs where the expectations are taken only with respect to the variables that are intermediate to $Y_l$ and $Y_k$ in the causal chain and not included in the specific IE of interest. Also, as shown in the definition of an IE (Definition 6.1.3), it should be noted that, when writing the IE as a product of difference quotients, any variable included in the causal chain that is antecedent to the specified $Y_{k'}$ variable and is in the IE chain of interest, say, for example, variable $Y_{k''}$, must be evaluated at $Y_{k''} = y_{k''} + h_{k''}$ within the $f_{l'}$ function of interest, as opposed to being evaluated at $Y_{k''} = y_{k''}$.


CHAPTER  7 
MODELS  WITH  BOTH  CONTINUOUS  AND  DICHOTOMOUS  VARIABLES 

In  this  chapter,  causal  systems  with  both  continuous  and  dichotomous 
variables  will  be  considered.  Herein,  we  present  special  case  studies  of  various 
models  containing  continuous  and  dichotomous  variables  simultaneously.  More 
specifically,  we  will  consider  systems  similar  to  the  system  given  by  Equations 
4.37-4.39, where the functions $f_k$ are not necessarily linear and some of the $Y$'s are
dichotomous.

7.1     Multiplicative Effects and the Dichotomous-Continuous-Dichotomous Variables Case

In  this  section,  we  consider  some  special  cases  of  combinations  of  continuous 
and  dichotomous  variables  within  a  causal  system.  For  these  particular  special 
cases,  we  develop  a  useful  alternative  definition  of  a  TE.  This  particular  definition 
is  extremely  useful  in  situations  where  it  is  convenient  and  practical  to  formulate 
the  TE  quantity  as  a  relative  risk  ratio. 

Further  generalizations  and  methodology  will  be  developed  that  will  apply 
these  concepts  to  causal  chains  involving  variables  (discrete  or  continuous)  in  a 
nonlinear  fashion.  In  these  cases,  alternate  definitions  of  "effects"  (direct  and 
indirect),  as  opposed  to  the  previous  definitions  presented  earlier,  may  be  useful 
and  appropriate.  For  example,  it  may  be  beneficial  to  view  the  "effects"  as 
relative  risks  in  some  cases.  To  this  end,  we  give  an  alternate  definition  of  TE. 
This alternate formulation of TE is useful and practical for models involving dichotomous variables as the initial and ending variables within the causal chain, with continuous intermediate variables. We refer to this alternate TE as a conditional multiplicative TE and denote this multiplicative effect of $Y_k$ on $Y_l$ as $TE^*_{lk|A_k}$. The formal definition follows, using notation presented in previous chapters.

Definition 7.1.1. The multiplicative total effect of a dichotomous variable $Y_k$ on $Y_l$ is denoted by $TE^*_{lk|A_k}$ and defined to be

\[
TE^*_{lk|A_k} = \frac{E(Y_l \mid \mathbf{Y}_{A_k} = \mathbf{y}_{A_k}, Y_k = 1)}{E(Y_l \mid \mathbf{Y}_{A_k} = \mathbf{y}_{A_k}, Y_k = 0)}.
\]

Note that if $k = 1$, then there are no antecedent variables and $TE^*_{lk}$ is an unconditional multiplicative effect.

7.1.1     The  p  Variable  Model 

We  now  consider  the  multiplicative  TE  in  the  context  of  the  following  p 
variable  model: 


\begin{align}
Y_1 &= \pi_1 + \epsilon_1 \tag{7.1} \\
Y_2 &= f_2(Y_1) + \epsilon_2 \tag{7.2} \\
Y_3 &= \eta_3 + \epsilon_3 \tag{7.3} \\
&\;\;\vdots \notag \\
Y_{p-1} &= \eta_{p-1} + \epsilon_{p-1} \tag{7.4} \\
Y_p &= \exp(\eta_p) + \epsilon_p \tag{7.5}
\end{align}

where $\eta_k = \mathbf{Y}'_{A_k} \boldsymbol{\beta}_k$, $k = 3, 4, \ldots, p$. The assumptions for this model are:

1.  $Y_1$ is distributed as a Bernoulli random variable with mean $\pi_1$.

2.  $Y_2, Y_3, \ldots, Y_{p-1}$ are continuous random variables.

3.  $Y_p$ is distributed as a Bernoulli random variable with mean $\exp(\eta_p)$.

4.  $\epsilon_k$, $k = 1, 2, \ldots, p$, has conditional expectation of zero, conditional on all previous endogenous variables in the causal chain.

5.  $\epsilon_k$ and $\epsilon_{k'}$ are mutually independent $\forall\; k \ne k'$, $k, k' = 2, 3, \ldots, p - 1$.

6.  $\epsilon_1$ and $\epsilon_2, \epsilon_3, \ldots, \epsilon_{p-1}$ are independent.

Note that assumption 5 above does not require that $\epsilon_p$ be independent of all previous error terms and endogenous variables. In fact, given the assumption of a Bernoulli distribution for $Y_p$, $\epsilon_p$ will not be independent of previous errors. Note also that assumptions 1-3 above indicate that we are studying the general dichotomous-continuous-dichotomous variables case with a chain of $p - 2$ intermediate continuous variables.

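To study the TE of $Y_1$ on $Y_p$, we utilize Definition 7.1.1 above and write

\[
TE^*_{p1} = \frac{E(Y_p \mid Y_1 = 1)}{E(Y_p \mid Y_1 = 0)}
\]

where

\[
E(Y_p \mid Y_1 = y_1) = \int_{\boldsymbol{\epsilon}_I} \exp(\eta_p)\, g(\boldsymbol{\epsilon}_I \mid y_1)\, d\boldsymbol{\epsilon}_I.
\]

Since the first through the $(p - 1)$th error terms are mutually independent and have conditional expectation of zero, the quantity $g(\boldsymbol{\epsilon}_I \mid y_1)$ is independent of the specific value of $Y_1 = y_1$ and, hence, we have

\[
\frac{E(Y_p \mid Y_1 = 1)}{E(Y_p \mid Y_1 = 0)}
= \frac{\int_{\boldsymbol{\epsilon}_I} \exp(\eta_p \mid Y_1 = 1)\, g(\boldsymbol{\epsilon}_I)\, d\boldsymbol{\epsilon}_I}{\int_{\boldsymbol{\epsilon}_I} \exp(\eta_p \mid Y_1 = 0)\, g(\boldsymbol{\epsilon}_I)\, d\boldsymbol{\epsilon}_I}.
\]

Writing out the integrals and canceling like terms in the numerator and denominator, we have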
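\begin{align*}
TE^*_{p1} &= \exp\big[\beta_{p1} + \beta_{p2}\, \Delta_{y_1} f_2(y_1) + \beta_{p3}\beta_{31} + \cdots + \beta_{p,p-1}\beta_{p-1,1} \\
&\qquad\quad + \beta_{p3}\beta_{32}\, \Delta_{y_1} f_2(y_1) + \cdots + \beta_{p,p-1}\beta_{p-1,2}\, \Delta_{y_1} f_2(y_1) \\
&\qquad\quad + \cdots + \beta_{p,p-1}\beta_{p-1,p-2} \cdots \beta_{31} + \beta_{p,p-1}\beta_{p-1,p-2} \cdots \beta_{32}\, \Delta_{y_1} f_2(y_1)\big] \\
&= \exp\Big[DE_{p1} + \sum_{Q \in 2^{B_{lk}} - \emptyset} IE_{l(Q)k}\Big] \\
&= DE_{p1} \cdot \prod_{Q \in 2^{B_{lk}} - \emptyset} IE_{l(Q)k}
\end{align*}

where $2^{B_{lk}}$ denotes the power set of $B_{lk}$, $Q$ denotes any arbitrary element of $2^{B_{lk}}$, and $l(Q)k$ represents the set $\{l\} \cup Q \cup \{k\}$. In the last line, $DE_{p1}$ and each $IE_{l(Q)k}$ denote the exponentiated (multiplicative) effects, as in the three variable example of Section 7.1.2, where $DE_{31} = \exp(\beta_{31})$. Therefore, $TE^*_{p1}$ can be expressed as a multiplicative Calculus of Effects (MCOE). The resulting format for $TE^*_{lk}$ can be written as

\[
TE^*_{p1} = E_{\mathbf{Y}_I \mid Y_1}\Big[\prod_{Q \in 2^{B_{lk}}} \Big[\prod_{(k', l') \in l(Q)k} DE_{l'k'}\Big]\Big].
\tag{7.6}
\]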

Note that the model above allows for any arbitrary specification for $f_2$. That is, $f_2$ need not be linear. If $f_2$ is specified to be a linear mean function, then testing the significance of all direct and indirect effects can be done as presented in Theorem 5.5.2. When $f_2$ is some unspecified nonlinear mean function, tests of significance of direct and indirect effects not involving variable $Y_2$ can be performed as in Theorem 5.5.2. However, methodology to test significance of any indirect effects that contain variable $Y_2$ in the causal chain is more complicated due to the nonlinearity of $f_2$ and will be investigated in future research.

7.1.2     Example  of  MCOE  Using  Three  Variables 

We  now  present  the  multiplicative  TE  in  the  context  of  the  following  three 
variable  model: 
\begin{align}
Y_1 &= \pi_1 + \epsilon_1 \tag{7.7} \\
Y_2 &= \beta_{20} + \beta_{21} Y_1 + \epsilon_2 \tag{7.8} \\
Y_3 &= \exp(\beta_{30} + \beta_{31} Y_1 + \beta_{32} Y_2) + \epsilon_3. \tag{7.9}
\end{align}

Here, $Y_1$ and $Y_3$ represent dichotomous variables and, thus, $Y_2$ is the only continuous variable in the causal chain. Also, recall that we make the assumptions that $E(\epsilon_1) = E(\epsilon_2 \mid y_1) = E(\epsilon_3 \mid y_1, \epsilon_2) = 0$ and that $\epsilon_2$ is independent of $Y_1$. To study the TE of $Y_1$ on $Y_3$, we write

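\begin{align*}
E_{\epsilon_2, \epsilon_3}(Y_3 \mid y_1)
&= \int_{\epsilon_2} \sum_{\epsilon_3} \left\{\exp[\beta_{30} + \beta_{31} y_1 + \beta_{32}(\beta_{20} + \beta_{21} y_1 + \epsilon_2)] + \epsilon_3\right\} g(\epsilon_3 \mid y_1, \epsilon_2)\, g(\epsilon_2 \mid y_1)\, d\epsilon_2 \\
&= e^{\beta_{30} + \beta_{32}\beta_{20}} \int_{\epsilon_2} \exp[\beta_{31} y_1 + \beta_{32}(\beta_{21} y_1 + \epsilon_2)]\, g(\epsilon_2 \mid y_1)\, d\epsilon_2 \\
&\quad + \int_{\epsilon_2} \Big[\sum_{\epsilon_3} \epsilon_3\, g(\epsilon_3 \mid y_1, \epsilon_2)\Big] g(\epsilon_2 \mid y_1)\, d\epsilon_2 \\
&= e^{\beta_{30} + \beta_{32}\beta_{20}} \int_{\epsilon_2} \exp[\beta_{31} y_1 + \beta_{32}(\beta_{21} y_1 + \epsilon_2)]\, g(\epsilon_2 \mid y_1)\, d\epsilon_2
\end{align*}

because $\epsilon_3$ has a conditional expectation of zero.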
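Hence, using Definition 7.1.1 above, we write

\begin{align*}
TE^*_{31} &= \frac{E(Y_3 \mid Y_1 = 1)}{E(Y_3 \mid Y_1 = 0)} \\
&= \frac{\int_{\epsilon_2} \exp[\beta_{31} + \beta_{32}(\beta_{21} + \epsilon_2)]\, g(\epsilon_2 \mid Y_1 = 1)\, d\epsilon_2}{\int_{\epsilon_2} \exp(\beta_{32}\epsilon_2)\, g(\epsilon_2 \mid Y_1 = 0)\, d\epsilon_2} \\
&= \frac{\exp(\beta_{31} + \beta_{32}\beta_{21}) \int_{\epsilon_2} \exp(\beta_{32}\epsilon_2)\, g(\epsilon_2 \mid Y_1 = 1)\, d\epsilon_2}{\int_{\epsilon_2} \exp(\beta_{32}\epsilon_2)\, g(\epsilon_2 \mid Y_1 = 0)\, d\epsilon_2} \\
&= \frac{\exp(\beta_{31} + \beta_{32}\beta_{21}) \int_{\epsilon_2} \exp(\beta_{32}\epsilon_2)\, g(\epsilon_2)\, d\epsilon_2}{\int_{\epsilon_2} \exp(\beta_{32}\epsilon_2)\, g(\epsilon_2)\, d\epsilon_2}
\end{align*}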
since $g(\epsilon_2 \mid Y_1 = 1) = g(\epsilon_2 \mid Y_1 = 0)$, i.e., $\epsilon_2$ is independent of $Y_1$. Therefore, we can also write the above result as

\begin{align*}
TE^*_{31} &= \exp(\beta_{31} + \beta_{32}\beta_{21}) \\
&= DE_{31} \cdot IE_{321}
\end{align*}

where $DE_{31} = \exp(\beta_{31})$ and $IE_{321} = \exp(\beta_{32}\beta_{21})$. Note that the linear specification given for $f_2(Y_1)$, as was seen in Equation 7.8 above, is merely a special case of the general model given in Equations 7.1-7.5 where $f_2$ is linear and, thus, $\Delta_{y_1} f_2(y_1) = \beta_{21}$. Note also that the above result for $TE^*_{31}$ can be expressed as

\[
TE^*_{31} = e^{COE},
\]

which more explicitly displays a multiplicative COE (MCOE) for causal chains involving log-linear $f_k$ functions and displays the relationship between the MCOE and the COC in classical linear path models, allowed here in the MCOE context due to the linear form of $f_2$.
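
The cancellation that produces this result can be illustrated numerically. The following is a minimal sketch under stated assumptions: the coefficient values are hypothetical, $\epsilon_2$ is taken to be standard Normal purely for illustration, and the same simulated draws of $\epsilon_2$ are used for $Y_1 = 1$ and $Y_1 = 0$, which reflects the assumed independence of $\epsilon_2$ and $Y_1$.

```python
import math
import random

def conditional_mean_y3(y1, beta, eps2_draws):
    """Monte Carlo average of exp(eta_3) given Y1 = y1 (Equations 7.8-7.9)."""
    b30, b31, b32, b20, b21 = beta
    total = 0.0
    for e2 in eps2_draws:
        y2 = b20 + b21 * y1 + e2                      # Equation 7.8
        total += math.exp(b30 + b31 * y1 + b32 * y2)  # mean of Y3 in 7.9
    return total / len(eps2_draws)

def multiplicative_te(beta, eps2_draws):
    """Ratio E(Y3 | Y1 = 1) / E(Y3 | Y1 = 0) from Definition 7.1.1."""
    return (conditional_mean_y3(1, beta, eps2_draws)
            / conditional_mean_y3(0, beta, eps2_draws))

# Hypothetical parameter values (b30, b31, b32, b20, b21).
beta = (-2.0, 0.3, 0.1, 0.5, 0.8)
rng = random.Random(0)
eps2_draws = [rng.gauss(0.0, 1.0) for _ in range(10000)]

te_star = multiplicative_te(beta, eps2_draws)
closed_form = math.exp(beta[1] + beta[2] * beta[4])  # exp(b31 + b32 * b21)
```

Because the integral over $\epsilon_2$ is common to the numerator and the denominator, `te_star` agrees with `closed_form` up to floating-point error regardless of the distribution used for the draws.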

7.1.3    Impracticality  of  MCOE  in  Models  with  Only  Dichotomous  Variables 

Note  that  the  above  methodology  is  applicable  to  the  dichotomous-continuous- 
dichotomous  case(s)  mentioned  above,  but  is  not  appropriate  or  useful  when 
studying  cases  with  only  dichotomous  variables  included  in  the  path  model.  For 
such  cases,  the  methodology  implementing  the  Calculus  of  Finite  Differences  as 
presented  in  Chapter  6  is  most  appealing  and  appropriate.  We  now  show  that  the 
MCOE  is  unappealing  when  applied  to  path  models  solely  containing  dichotomous 
variables. 

Consider  a  three  variable  path  model  of  dichotomous  variables.  We  write 
\begin{align*}
Y_1 &= \pi_1 + \epsilon_1 \\
Y_2 &= f_2(Y_1) + \epsilon_2 \\
Y_3 &= f_3(Y_1, Y_2) + \epsilon_3,
\end{align*}

using the same model, notation and assumptions as those presented in Section 6.2. Using the MCOE methodology presented earlier, we must evaluate each of the quantities $E(Y_3 \mid Y_1 = 1)$ and $E(Y_3 \mid Y_1 = 0)$. Hence, we write


Thus,  we  have  that 


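\begin{align}
E_{\epsilon_2}(Y_3 \mid Y_1 = 1) &= f_3(1, 1)\, f_2(1) + f_3(1, 0)[1 - f_2(1)] \tag{7.10} \\
&= [\Delta_{y_2} f_3(1, y_2)]\, f_2(1) + f_3(1, 0). \tag{7.11}
\end{align}

Likewise, we have

\begin{align}
E_{\epsilon_2}(Y_3 \mid Y_1 = 0) &= f_3(0, 1)\, f_2(0) + f_3(0, 0)[1 - f_2(0)] \tag{7.12} \\
&= [\Delta_{y_2} f_3(0, y_2)]\, f_2(0) + f_3(0, 0). \tag{7.13}
\end{align}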

The  MCOE  methodology  would  then  divide  Equation  7.11  by  Equation  7.13,  which 
gives  an  unappealing  result  with  no  recognizable  or  interpretable  form. 
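
To make the point concrete, the two conditional means and their ratio can be computed for a small example. This is a minimal sketch: the tables `F2` and `F3` hold hypothetical conditional means chosen only for illustration, and the function names are not from the original text.

```python
# Hypothetical conditional means: F2[y1] = f2(y1), F3[(y1, y2)] = f3(y1, y2).
F2 = {0: 0.3, 1: 0.6}
F3 = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.7}

def mean_y3(y1):
    """E(Y3 | Y1 = y1) as in Equations 7.10 and 7.12."""
    return F3[(y1, 1)] * F2[y1] + F3[(y1, 0)] * (1.0 - F2[y1])

def mean_y3_difference_form(y1):
    """The same quantity written with a finite difference (7.11 and 7.13)."""
    return (F3[(y1, 1)] - F3[(y1, 0)]) * F2[y1] + F3[(y1, 0)]

# The MCOE ratio: Equation 7.11 divided by Equation 7.13.
ratio = mean_y3_difference_form(1) / mean_y3_difference_form(0)
```

The ratio is a quotient of two sums, so it does not factor into a product of a direct effect and an indirect effect in the way that $\exp(\beta_{31} + \beta_{32}\beta_{21})$ does in Section 7.1.2.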

7.1.4    Futility  of  MCOE  in  Models  with  Nonadditive  Error  Terms 

The derivation above and the resulting MCOE are only achieved if the $\epsilon_p$ term included in $Y_p$ is in additive form. If $Y_p$ is of the form $Y_p = f_p(Y_1, Y_2, \ldots, Y_{p-1}, \epsilon_p)$ where the $\epsilon_p$ term is included in the functional form of $f_p$, then the MCOE result above does not hold. This fact can be seen in the following example.
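Consider the following three variable path model:

\begin{align*}
Y_1 &= \pi_1 + \epsilon_1 \\
Y_2 &= \beta_{20} + \beta_{21} Y_1 + \epsilon_2 \\
Y_3 &= \exp(\beta_{30} + \beta_{31} Y_1 + \beta_{32} Y_2 + \epsilon_3).
\end{align*}

Note that in the model given above, $\epsilon_3$ enters into the equation for $Y_3$ within the exponent, and not additively (outside the exponent) as was seen earlier. Hence, in the formulation of $E_{\epsilon_2, \epsilon_3}(Y_3 \mid y_1)$, we have that

\begin{align*}
E_{\epsilon_2, \epsilon_3}(Y_3 \mid y_1)
&= \int_{\epsilon_2} \sum_{\epsilon_3} \left\{\exp[\beta_{30} + \beta_{31} y_1 + \beta_{32}(\beta_{20} + \beta_{21} y_1 + \epsilon_2) + \epsilon_3]\right\}
P(\epsilon_3 \mid y_1, \epsilon_2)\, P(\epsilon_2 \mid y_1)\, d\epsilon_2 \\
&= \exp[\beta_{30} + \beta_{32}\beta_{20} + (\beta_{31} + \beta_{32}\beta_{21}) y_1]
\cdot \int_{\epsilon_2} \sum_{\epsilon_3} \exp(\beta_{32}\epsilon_2 + \epsilon_3)\, P(\epsilon_3, \epsilon_2 \mid y_1)\, d\epsilon_2.
\end{align*}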

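For this particular model, we have that

\[
E_{\epsilon_2, \epsilon_3}(Y_3 \mid Y_1 = 1) = \exp(\beta_{30} + \beta_{32}\beta_{20} + \beta_{31} + \beta_{32}\beta_{21})\, E_{\epsilon_2, \epsilon_3 \mid Y_1 = 1}[\exp(\beta_{32}\epsilon_2 + \epsilon_3)]
\]

and

\[
E_{\epsilon_2, \epsilon_3}(Y_3 \mid Y_1 = 0) = \exp(\beta_{30} + \beta_{32}\beta_{20})\, E_{\epsilon_2, \epsilon_3 \mid Y_1 = 0}[\exp(\beta_{32}\epsilon_2 + \epsilon_3)].
\]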

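Note that $\epsilon_3$ is a part of the exponential term in each of these two expectations. This was not the case in the earlier example where $\epsilon_3$ entered additively into the expectation. Hence, since $\epsilon_3$ is part of the expectation that needs to be evaluated and since $\epsilon_3$ is not independent of the previous $Y$'s, the methodology used to develop the MCOE is not appealing in this situation. In particular, using the MCOE methodology applied to this example, we are not able to adequately evaluate the quantity

\begin{align*}
TE^*_{31} &= \frac{E(Y_3 \mid Y_1 = 1)}{E(Y_3 \mid Y_1 = 0)} \\
&= \frac{\exp(\beta_{30} + \beta_{32}\beta_{20} + \beta_{31} + \beta_{32}\beta_{21})\, E_{\epsilon_2, \epsilon_3 \mid Y_1 = 1}[\exp(\beta_{32}\epsilon_2 + \epsilon_3)]}{\exp(\beta_{30} + \beta_{32}\beta_{20})\, E_{\epsilon_2, \epsilon_3 \mid Y_1 = 0}[\exp(\beta_{32}\epsilon_2 + \epsilon_3)]} \\
&= \exp(\beta_{31} + \beta_{32}\beta_{21}) \cdot \frac{E_{\epsilon_2, \epsilon_3 \mid Y_1 = 1}[\exp(\beta_{32}\epsilon_2 + \epsilon_3)]}{E_{\epsilon_2, \epsilon_3 \mid Y_1 = 0}[\exp(\beta_{32}\epsilon_2 + \epsilon_3)]}.
\end{align*}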

The expectation included in the numerator must be evaluated with respect to the probability density

\[
P(\epsilon_2, \epsilon_3 \mid Y_1 = 1) = P(\epsilon_3 \mid Y_1 = 1, \epsilon_2)\, P(\epsilon_2 \mid Y_1 = 1).
\]

Likewise, the expectation in the denominator must be written with respect to the probability density

\[
P(\epsilon_2, \epsilon_3 \mid Y_1 = 0) = P(\epsilon_3 \mid Y_1 = 0, \epsilon_2)\, P(\epsilon_2 \mid Y_1 = 0).
\]

Because $Y_3$ is a Bernoulli random variable, the quantities $P(\epsilon_3 \mid Y_1 = 1, \epsilon_2)$ and $P(\epsilon_3 \mid Y_1 = 0, \epsilon_2)$ are dependent upon the specified value of $Y_1 = y_1$ and, hence, the two conditional values are not equal to each other. Thus, the fractional component in the above TE representation does not equal one. Consequently, even the MCOE methodology is not appealing or useful for this particular special case.

7.2     Dichotomous Outcome Variable with Antecedent Continuous Variables

Consider the case of a p variable model, written as follows:

\begin{align*}
Y_1 &= \mu_1 + \epsilon_1 \\
Y_2 &= \eta_2 + \epsilon_2 \\
&\;\;\vdots \\
Y_{p-1} &= \eta_{p-1} + \epsilon_{p-1} \\
Y_p &= \exp(\eta_p) + \epsilon_p
\end{align*}

where $\eta_k = \mathbf{Y}'_{A_k} \boldsymbol{\beta}_k$, $k = 2, 3, \ldots, p$. We make the following assumptions for the above model:

1.  $Y_k$, $k = 1, 2, \ldots, p - 1$, is a continuous random variable.

2.  $Y_p$ is distributed as a Bernoulli random variable with mean $\exp(\eta_p)$.

3.  $E(\epsilon_1) = E(\epsilon_2 \mid y_1) = \cdots = E(\epsilon_p \mid y_1, \epsilon_2, \ldots, \epsilon_{p-1}) = 0$.

4.  $\epsilon_k$, $k = 1, 2, \ldots, p - 1$, are mutually independent as well as independent of any previous endogenous variables.

Note that $\epsilon_p$ is not independent of $\epsilon_1, \epsilon_2, \ldots, \epsilon_{p-1}$, or the prior endogenous variables, due to the fact that $Y_p$ is dichotomous.
To study the quantity $TE_{p1}$, we write


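\[
TE_{p1} = \frac{\partial}{\partial y_1} E(Y_p \mid y_1)
\]

where

\begin{align*}
E_{Y_p \mid Y_1}(Y_p \mid y_1) &= E_{\mathbf{Y}_I \mid Y_1}[\exp(\eta_p) + \epsilon_p] \\
&= \int_{\boldsymbol{\epsilon}_I} \exp(\eta_p) \cdot g(\boldsymbol{\epsilon}_I \mid y_1)\, d\boldsymbol{\epsilon}_I
\end{align*}

because $\epsilon_p$ has conditional expectation of zero. Hence, we have that

\begin{align*}
TE_{p1} &= \frac{\partial}{\partial y_1} E(Y_p \mid y_1) \\
&= \frac{\partial}{\partial y_1} \int_{\boldsymbol{\epsilon}_I} \exp(\eta_p)\, g(\boldsymbol{\epsilon}_I \mid y_1)\, d\boldsymbol{\epsilon}_I \\
&= E_{\boldsymbol{\epsilon}_I}\left\{\frac{\partial}{\partial y_1}[\exp(\eta_p)]\right\}.
\end{align*}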

Note that we can interchange the derivative and expectation immediately above due to the mutual independence of the error terms and independence from $Y_1$. Thus, taking the derivative with respect to $y_1$, we write

\begin{align*}
TE_{p1} &= E_{\boldsymbol{\epsilon}_I}\left[\frac{\partial \exp(\eta_p)}{\partial \eta_p} \frac{\partial \eta_p}{\partial y_1}\right] \\
&= \frac{\partial \eta_p}{\partial y_1}\, E_{\boldsymbol{\epsilon}_I}\left[\frac{\partial \exp(\eta_p)}{\partial \eta_p}\right] \\
&= COC \cdot E_{\boldsymbol{\epsilon}_I}\left[\frac{\partial \exp(\eta_p)}{\partial \eta_p}\right]
\end{align*}

where the notation 'COC' represents the Calculus of Coefficients as given in Chapter 4. This form is similar to that derived in the case of strictly continuous variable path models, as was seen in Sections 5.1 and 5.6. Hence, the quantity $E[\exp(\eta_p)]$ can be estimated via MC estimation techniques. Likewise, using the testing methodology presented in Chapter 5, the significance of the direct and indirect effects can be tested as shown in Section 5.5.4 since $E[\exp(\eta_p)] \ne 0$. Note that this result can be extended to cases where the quantity $TE_{pk|A_k}$ is of interest, letting $p$ denote the last variable in the causal chain and $\mathbf{Y}_{A_k}$ denote the variables antecedent to $Y_k$, again using methodology presented in Chapter 5.
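
The factorization $TE_{p1} = COC \cdot E[\exp(\eta_p)]$ can be sketched for the smallest case, $p = 3$. The example below is illustrative only: the coefficient values are hypothetical, $\epsilon_2$ is assumed standard Normal, and the function names are not from the original text. The Monte Carlo estimate of the total effect is compared with a numerical derivative of the estimated conditional mean, using the same simulated draws for both.

```python
import math
import random

def eta3(y1, e2, beta):
    """Linear predictor of Y3 after substituting Y2 = b20 + b21*y1 + e2."""
    b20, b21, b30, b31, b32 = beta
    y2 = b20 + b21 * y1 + e2
    return b30 + b31 * y1 + b32 * y2

def mc_mean_y3(y1, beta, draws):
    """Monte Carlo estimate of E(Y3 | y1) = E[exp(eta_3)]."""
    return sum(math.exp(eta3(y1, e2, beta)) for e2 in draws) / len(draws)

def mc_total_effect(y1, beta, draws):
    """TE_31 = COC * E[exp(eta_3)], with COC = b31 + b32 * b21."""
    b20, b21, b30, b31, b32 = beta
    coc = b31 + b32 * b21
    return coc * mc_mean_y3(y1, beta, draws)

# Hypothetical parameter values (b20, b21, b30, b31, b32).
beta = (0.5, 0.8, -3.0, 0.2, 0.1)
rng = random.Random(1)
draws = [rng.gauss(0.0, 1.0) for _ in range(5000)]

y1 = 1.0
te = mc_total_effect(y1, beta, draws)

# Central difference of the estimated conditional mean, same draws.
h = 1e-5
finite_difference = (mc_mean_y3(y1 + h, beta, draws)
                     - mc_mean_y3(y1 - h, beta, draws)) / (2.0 * h)
```

With common draws, `te` and `finite_difference` agree to numerical precision, which is the interchange of derivative and expectation used in the derivation above.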


CHAPTER  8 
APPLICATIONS 

8.1     An  Application  of  the  Nonlinear  Model  with  Continuous  Variables 
8.1.1     Introduction 

In  this  application  we  consider  the  causal  relationship  between  the  four 
endogenous  variables  mother's  age,  child's  birth  weight,  child's  length  of  stay 
in  the  hospital  at  birth  and  child's  mental  development  index  score.  We  also 
consider  one  exogenous  variable,  race  of  the  child,  which  can  assume  one  of  four 
possible  categories  (black,  hispanic,  white  or  other).  Of  particular  interest  in  this 
application  are  the  direct  and  indirect  effects  of  mother's  age  and  birth  weight  on 
child's  mental  development  and  if  these  effects  are  significant  when  child's  length  of 
stay  in  the  hospital  is  included  in  the  causal  chain.  We  assume  that  the  following 
model  holds: 


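\begin{align}
Y_1 &= \mu_1 + \mathbf{X}'\boldsymbol{\gamma}_1 + \epsilon_1 \tag{8.1} \\
Y_2 &= \beta_{20} + \beta_{21} Y_1 + \mathbf{X}'\boldsymbol{\gamma}_2 + \epsilon_2 \tag{8.2} \\
Y_3 &= \exp[\beta_{30} + \beta_{31} Y_1 + \beta_{32} Y_2 + \mathbf{X}'\boldsymbol{\gamma}_3] + \epsilon_3 \tag{8.3} \\
Y_4 &= \beta_{40} + \beta_{41} Y_1 + \beta_{42} Y_2 + \beta_{43} Y_3 + \mathbf{X}'\boldsymbol{\gamma}_4 + \epsilon_4 \tag{8.4}
\end{align}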

where 

$Y_1$ = mother's age (MAGE),

$Y_2$ = child's birth weight (BW),

$Y_3$ = child's length of stay in the hospital at birth (LOS),

$Y_4$ = child's mental development index score at eighteen months of age (MDI),

$\mathbf{X}' = (X_B, X_H, X_O)$,

$X_B$ = 1 if race of child is 'black', 0 else,

$X_H$ = 1 if race of child is 'hispanic', 0 else,

$X_O$ = 1 if race of child is 'other', 0 else,

and the $\epsilon_k$, $k = 1, 2, 3, 4$, terms follow the $N(0, \sigma_k^2)$ distribution and are mutually independent. Note that omitting the indicator variable for RACE='white' amounts to letting the 'white' category serve as the reference category throughout the study and analysis.

The variable $Y_1$ was measured in years, with values ranging between 19 and 50, inclusive. (The sample was selected to include this age range only in order to avoid extreme collinearity of mother's age, education level, and marital status among teenagers.) The variable $Y_2$ was measured in units of grams, with values between 450 and 6000, inclusive. Variable $Y_3$ was measured in days. Acceptable values for variable $Y_4$ were between 50 and 150, inclusive. The categories of the exogenous variable, RACE, were black (B), hispanic (H), white (W) or other (O). The data set to be analyzed was obtained from the merged data sets of Florida Birth Vital Statistics (VS) and Florida Regional Perinatal Intensive Care Centers (RPICC). (The RPICC centers are intensive care units for infants who are premature and/or sick at birth.) Each child included in this data set was born between the dates of 09/01/82 and 08/31/87, inclusive. Records with missing values for any of the variables included in this model were deleted. The final data set consisted of 3,197 records with complete data.
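
The reference-category coding of RACE described above can be sketched as follows. This is only an illustration of the indicator scheme: the function name `race_indicators` and the lowercase category labels are assumptions, not part of the original analysis, which was carried out in SAS.

```python
def race_indicators(race):
    """Return (X_B, X_H, X_O) for one record, with 'white' as the reference."""
    label = race.strip().lower()
    if label not in {"black", "hispanic", "white", "other"}:
        raise ValueError(f"unrecognized race category: {race!r}")
    return (
        int(label == "black"),     # X_B
        int(label == "hispanic"),  # X_H
        int(label == "other"),     # X_O
    )

# A 'white' record has all three indicators equal to zero.
reference_row = race_indicators("white")
```

Under this coding, each element of a coefficient vector for the exogenous variables is read as a contrast between the corresponding category and the 'white' category.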
8.1.2     The Calculus of Effects

Under the assumptions of independent and normally distributed $\epsilon_k$ terms (i.e., the $\epsilon_k$ terms follow independent Normal distributions each with mean zero and variance $\sigma_k^2$), we have

as  was  seen  in  the  derivation  of  a  four  variable  path  model  and  presented  in  Section 
5.3,  Equation  5.14.  (We  focus  on  this  particular  quantity,  TE41,  here,  but  will 
discuss  other  total,  direct  and  indirect  effects  associated  with  this  causal  system  in 
subsequent  sections.)  From  Equation  5.20  we  have  that 

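\[
TE_{41} = E_{\epsilon_2, \epsilon_3}\left(\frac{\partial f_4}{\partial y_1}\right)
+ E_{\epsilon_2}\left(\frac{\partial f_4}{\partial y_3}\frac{\partial f_3}{\partial y_1}\right)
+ E_{\epsilon_2, \epsilon_3}\left(\frac{\partial f_4}{\partial y_2}\frac{\partial f_2}{\partial y_1}\right)
+ E_{\epsilon_2, \epsilon_3}\left(\frac{\partial f_4}{\partial y_3}\frac{\partial f_3}{\partial y_2}\frac{\partial f_2}{\partial y_1}\right).
\]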

Thus,  we  have 

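\begin{align}
TE_{41} &= \beta_{41} + \beta_{42}\beta_{21} \tag{8.6} \\
&\quad + \beta_{43}\beta_{31} \exp\left(\beta_{30} + \beta_{32}\beta_{20} + (\beta_{31} + \beta_{32}\beta_{21}) y_1 + (\beta_{32}\boldsymbol{\gamma}_2 + \boldsymbol{\gamma}_3)'\mathbf{x}\right) \cdot E_{\epsilon_2}[\exp(\beta_{32}\epsilon_2)] \notag \\
&\quad + \beta_{43}\beta_{32}\beta_{21} \exp\left(\beta_{30} + \beta_{32}\beta_{20} + (\beta_{31} + \beta_{32}\beta_{21}) y_1 + (\beta_{32}\boldsymbol{\gamma}_2 + \boldsymbol{\gamma}_3)'\mathbf{x}\right) \cdot E_{\epsilon_2}[\exp(\beta_{32}\epsilon_2)]. \notag
\end{align}

The value of $TE_{41}$ consists of four individual terms which can be interpreted as follows:

1.  $\beta_{41} = DE_{41}$ = DE of MAGE on MDI.

2.  $\beta_{42}\beta_{21} = IE_{421}$ = first order IE of MAGE on MDI through BW.

3.  $\beta_{43}\beta_{31} \exp\left(\beta_{30} + \beta_{32}\beta_{20} + (\beta_{31} + \beta_{32}\beta_{21}) y_1 + (\beta_{32}\boldsymbol{\gamma}_2 + \boldsymbol{\gamma}_3)'\mathbf{x}\right) \cdot E_{\epsilon_2}[\exp(\beta_{32}\epsilon_2)] = IE_{431}$ = average of first order IE of MAGE on MDI through LOS.

4.  $\beta_{43}\beta_{32}\beta_{21} \exp\left(\beta_{30} + \beta_{32}\beta_{20} + (\beta_{31} + \beta_{32}\beta_{21}) y_1 + (\beta_{32}\boldsymbol{\gamma}_2 + \boldsymbol{\gamma}_3)'\mathbf{x}\right) \cdot E_{\epsilon_2}[\exp(\beta_{32}\epsilon_2)] = IE_{4321}$ = average of second order IE of MAGE on MDI through BW and then through LOS.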

As  discussed  earlier  in  Section  5.3  when  deriving  the  COE  under  the  classical 
assumption  of  independent  error  terms,  even  though  this  is  a  nonlinear  model, 
we  have  shown  (in  Equation  5.20)  that  the  quantity  TE41  can  be  partitioned 
into  the  sum  of  one  expected  DE  and  expected  values  of  all  IEs  along  the  causal 
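chain. Again, note that each IE can itself be written as the product of DEs. As an example, consider the average second order IE of MAGE on MDI shown directly above, $IE_{4321}$. We write

\begin{align*}
IE_{4321} &= \beta_{43}\beta_{32}\beta_{21} \exp\left(\beta_{30} + \beta_{32}\beta_{20} + (\beta_{31} + \beta_{32}\beta_{21}) y_1 + (\beta_{32}\boldsymbol{\gamma}_2 + \boldsymbol{\gamma}_3)'\mathbf{x}\right) \cdot E_{\epsilon_2}[\exp(\beta_{32}\epsilon_2)] \\
&= DE_{43} \cdot DE_{32} \cdot DE_{21}
\end{align*}

where

\[
DE_{43} = \beta_{43},
\]

\[
DE_{32} = \beta_{32} \exp\left(\beta_{30} + \beta_{32}\beta_{20} + (\beta_{31} + \beta_{32}\beta_{21}) y_1 + (\beta_{32}\boldsymbol{\gamma}_2 + \boldsymbol{\gamma}_3)'\mathbf{x}\right) \cdot E_{\epsilon_2}[\exp(\beta_{32}\epsilon_2)],
\]

and

\[
DE_{21} = \beta_{21}.
\]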

This formulation of the TE, average DE and average IEs directly follows the COE
formulation in Theorem 5.5.1.
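
As a numerical sanity check on this partition, the following Python sketch compares a finite-difference derivative of the conditional mean $E(Y_4 \mid y_1)$ with the sum of the four effects. It is only an illustration under stated assumptions: it takes the functional forms implied by Equation 8.6 (linear means for BW and MDI, an exponential mean for LOS), uses the rounded estimates reported in Table 8.1 and the MSE reported in Section 8.1.4, holds RACE at the reference category so that all $\gamma$ terms vanish, and replaces the expectation by the closed form $\exp(\beta_{32}^2\sigma_2^2/2)$, which holds for a Normal error term. The function names are hypothetical and are not part of the original analysis.

```python
import math

# Rounded estimates from Table 8.1; RACE is held at the reference
# category (white), so every gamma term is zero. Illustrative only.
B20, B21 = 2277.1323, -3.3151
B30, B31, B32 = 4.0305, 0.0156, -0.0005
B40, B41, B42, B43 = 103.7954, 0.1161, -0.0003, -0.1592
SIGMA2_SQ = 987162.77  # MSE of the BW model (Section 8.1.4)

# For e2 ~ Normal(0, sigma^2), E[exp(b * e2)] = exp(b^2 * sigma^2 / 2).
M = math.exp(B32 ** 2 * SIGMA2_SQ / 2.0)


def los_mean(y1):
    """E(Y3 | y1) after averaging over the BW error term."""
    return math.exp(B30 + B32 * B20 + (B31 + B32 * B21) * y1) * M


def mdi_mean(y1):
    """E(Y4 | y1) with BW and LOS replaced by their conditional means."""
    bw_mean = B20 + B21 * y1
    return B40 + B41 * y1 + B42 * bw_mean + B43 * los_mean(y1)


def effects(y1):
    """Return DE41, IE421, IE431 and IE4321 evaluated at y1."""
    de31 = B31 * los_mean(y1)  # average DE of MAGE on LOS
    de32 = B32 * los_mean(y1)  # average DE of BW on LOS
    return {
        "DE41": B41,
        "IE421": B42 * B21,
        "IE431": B43 * de31,
        "IE4321": B43 * de32 * B21,  # DE43 * DE32 * DE21
    }


def total_effect_numeric(y1, h=1e-4):
    """Central-difference derivative of E(Y4 | y1) with respect to y1."""
    return (mdi_mean(y1 + h) - mdi_mean(y1 - h)) / (2.0 * h)


if __name__ == "__main__":
    y1 = 25.0  # an arbitrary illustrative value of MAGE
    parts = effects(y1)
    print(parts)
    print(sum(parts.values()), total_effect_numeric(y1))
```

The two printed totals should agree to within finite-difference error, which is the content of the partition: the derivative of the conditional mean equals the direct effect plus the three averaged indirect effects.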

8.1.3    Estimation of Effects

For the specific four variable model under consideration in this section, we first
must obtain estimates of the $\beta_{kj}$ and $\gamma_k$ quantities in Equation 8.6 above in order
to estimate the quantity $TE_{41}$. These estimates are obtained by independently
modeling each of Equations 8.1 through 8.4 in SAS [57] using PROC GLM or
PROC NLIN [58]. After the individual $\beta_{kj}$ and $\gamma_k$ parameters have been estimated,
the value of $E_{\epsilon_2}[\exp(\beta_{32}\epsilon_2)]$ in Equation 8.6 above must be estimated.

To estimate the expectation above, that is, to estimate the quantity $E_{\epsilon_2}[\exp(\beta_{32}\epsilon_2)]$,
we must first estimate $\sigma_2^2$ based on the value of the Mean Square Error (MSE) term
for the model involving $Y_2$ (BW), that is, Equation 8.2. Using this estimated quantity
of $\sigma_2^2$, $n_2$ simulated values of $\epsilon_2$ were generated from the Normal distribution
by using the RANNOR [59] function in SAS. These simulated values, $\tilde\epsilon_2$, and the
estimated quantity of $\beta_{32}$ obtained from the estimated equation for $Y_3$, that is, $\hat\beta_{32}$,
then were used to calculate the quantity $\exp(\hat\beta_{32}\tilde\epsilon_2)$. If there had been more than
one expectation involving $\epsilon$ terms, independent values of each $\epsilon_k$ would have been
generated from the appropriate distributions using the associated estimated values
of $\sigma_k^2$.

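Following MC integration methodology, a MC estimate of the value $E_{\epsilon_2}[\exp(\beta_{32}\epsilon_2)]$,
using the $n_2$ simulated values of $\epsilon_2$ and, subsequently, $n_2$ values of $\exp(\hat\beta_{32}\tilde\epsilon_2)$, was
calculated as

$$\hat E_{\epsilon_2}[\exp(\beta_{32}\epsilon_2)] = \frac{1}{n_2}\sum_{t=1}^{n_2}\exp(\hat\beta_{32}\tilde\epsilon_{2t}), \tag{8.7}$$

where $\tilde\epsilon_{2t}$ denotes the $t$th simulated value of $\epsilon_2$.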
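
The MC step in Equation 8.7 can be sketched in a few lines of Python. This is an illustration only, not the SAS program used for the analysis: `random.gauss` stands in for RANNOR, the function names and seed are hypothetical, and the inputs are the rounded values reported in Section 8.1.4. A smaller number of draws than the one used in the text keeps the run short. Because the simulated errors are Normal, the closed form $\exp(\beta_{32}^2\sigma_2^2/2)$ is available as a check on the simulation.

```python
import math
import random


def mc_expectation(beta32_hat, sigma2_sq_hat, n, seed=12345):
    """Monte Carlo estimate of E[exp(beta32 * e2)] for e2 ~ Normal(0, sigma^2).

    Returns the sample mean, the sample standard deviation of the simulated
    exp(beta32 * e2) values, and the standard error of the mean (sd / sqrt(n)).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2_sq_hat)
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        value = math.exp(beta32_hat * rng.gauss(0.0, sigma))
        total += value
        total_sq += value * value
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    sd = math.sqrt(variance)
    return mean, sd, sd / math.sqrt(n)


def closed_form(beta32_hat, sigma2_sq_hat):
    """Exact value under Normal errors: exp(beta^2 * sigma^2 / 2)."""
    return math.exp(beta32_hat ** 2 * sigma2_sq_hat / 2.0)


if __name__ == "__main__":
    est, sd, se = mc_expectation(-0.0005, 987162.77, n=200_000)
    print(est, sd, se, closed_form(-0.0005, 987162.77))
```

The sketch reports the standard deviation of the simulated values and the standard error of their mean separately, since the two quantities differ by a factor of $\sqrt{n_2}$ and are easy to confuse.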


8.1.4    Estimation  Results 


For this particular example, $n_2 = 1{,}000{,}000$ values for $\epsilon_2$ were randomly generated
from the normal distribution with mean zero and variance $\hat\sigma_2^2 = 987{,}162.77$
(obtained from the value of the Mean Square Error (MSE) from the estimated
model for $Y_2$). From these observations, and using the estimated value of
$\hat\beta_{32} = -0.0005$ ($\hat\sigma_{\hat\beta_{32}} = 0.0000$), the following MC estimate was calculated:

$$\hat E[\exp(\beta_{32}\epsilon_2)] = 1.1307,$$

with a MC standard error of 0.5985.

Table 8.1: $\beta_{kj}$ Parameter Estimates

| Parameter | Estimate | SE |
|---|---|---|
| $\beta_{20}$ | 2277.1323 | 90.0682 |
| $\beta_{21}$ | -3.3151 | 3.2075 |
| $\beta_{30}$ | 4.0305 | 0.0779 |
| $\beta_{31}$ | 0.0156 | 0.0027 |
| $\beta_{32}$ | -0.0005 | 0.0000 |
| $\beta_{40}$ | 103.7954 | 2.0893 |
| $\beta_{41}$ | 0.1161 | 0.0645 |
| $\beta_{42}$ | -0.0003 | 0.0004 |
| $\beta_{43}$ | -0.1592 | 0.0102 |

Table 8.1 gives a complete list of the estimated $\beta_{kj}$ parameters, with the
associated estimated standard errors (SE). Table 8.2 displays each of the $\gamma_k$
parameter estimates and their standard errors, where

$\gamma_{kB}$ = the value of $\gamma$ in the $k$th path equation for RACE = black,
$\gamma_{kH}$ = the value of $\gamma$ in the $k$th path equation for RACE = hispanic,
$\gamma_{kO}$ = the value of $\gamma$ in the $k$th path equation for RACE = other.

Note that the RACE category of 'white' is used as the reference category in all
models. Thus, $\gamma_{kW} = 0 \;\forall\, k$.

8.1.5    Partitioned  Effect  Estimates 

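Table 8.2: $\gamma_k$ Parameter Estimates

| Parameter | Estimate | SE |
|---|---|---|
| $\gamma_{1B}$ | -0.5661 | 0.2113 |
| $\gamma_{1H}$ | 0.4345 | 0.0327 |
| $\gamma_{1O}$ | 0.1280 | 1.0865 |
| $\gamma_{2B}$ | -155.0819 | 38.3423 |
| $\gamma_{2H}$ | 190.0878 | 54.8726 |
| $\gamma_{2O}$ | 113.4185 | 196.9224 |
| $\gamma_{3B}$ | -0.0253 | 0.0317 |
| $\gamma_{3H}$ | 0.0638 | 0.0456 |
| $\gamma_{3O}$ | 0.0007 | 0.1629 |
| $\gamma_{4B}$ | -4.7744 | 0.7728 |
| $\gamma_{4H}$ | -6.0313 | 1.1054 |
| $\gamma_{4O}$ | -3.4897 | 3.9594 |

Table 8.3: Effect Estimates

| Variable Effect | Effect Type | Effect | Effect Estimate | p-value |
|---|---|---|---|---|
| MAGE->BW | $DE_{21}$ | $\beta_{21}$ | -3.3151 | 0.3014 |
| MAGE->LOS | $DE_{31}$ | $\beta_{31} e^{\beta_{30}^* + \gamma^* + \beta_{31}^* y_1} \cdot E[e^{\beta_{32}\epsilon_2}]$ | $0.3180 e^{0.0173 y_1} \cdot R_1$ | 0.0001 |
| MAGE->MDI | $DE_{41}$ | $\beta_{41}$ | 0.1161 | 0.0721 |
| MAGE->BW->LOS | $IE_{321}$ | $\beta_{32}\beta_{21} e^{\beta_{30}^* + \gamma^* + \beta_{31}^* y_1} \cdot E[e^{\beta_{32}\epsilon_2}]$ | $0.0338 e^{0.0173 y_1} \cdot R_1$ | 0.2892 |
| MAGE->BW->MDI | $IE_{421}$ | $\beta_{42}\beta_{21}$ | 0.0012 | 0.5686 |
| MAGE->LOS->MDI | $IE_{431}$ | $\beta_{43}\beta_{31} e^{\beta_{30}^* + \gamma^* + \beta_{31}^* y_1} \cdot E[e^{\beta_{32}\epsilon_2}]$ | $-0.0506 e^{0.0173 y_1} \cdot R_1$ | 0.0000 |
| MAGE->BW->LOS->MDI | $IE_{4321}$ | $\beta_{43}\beta_{32}\beta_{21} e^{\beta_{30}^* + \gamma^* + \beta_{31}^* y_1} \cdot E[e^{\beta_{32}\epsilon_2}]$ | $-0.0054 e^{0.0173 y_1} \cdot R_1$ | 0.2984 |
| BW->LOS | $DE_{32}$ | $\beta_{32} e^{\beta_{30} + X'\gamma_3 + \beta_{31} y_1 + \beta_{32} y_2}$ | $-0.0281 e^{0.0156 y_1 - 0.0005 y_2} \cdot R_2$ | 0.0001 |
| BW->MDI | $DE_{42}$ | $\beta_{42}$ | -0.0004 | 0.3892 |
| BW->LOS->MDI | $IE_{432}$ | $\beta_{43}\beta_{32} e^{\beta_{30} + X'\gamma_3 + \beta_{31} y_1 + \beta_{32} y_2}$ | $0.0045 e^{0.0156 y_1 - 0.0005 y_2} \cdot R_2$ | 0.0000 |
| LOS->MDI | $DE_{43}$ | $\beta_{43}$ | -0.1592 | 0.0001 |

Table 8.3 shows the parameter representation for each effect in the COE
partitioning of each TE, along with the effect estimate and associated p-value,
calculated from the test statistic, $Z$, as given in Theorem 5.5.2. The first column
of Table 8.3 lists the path of variables of the effect of interest. The 'Effect Type'
column displays the type of effect (direct or indirect) and also any intermediate
variables associated with an IE. In the 'Effect' column the parametric representation
of each effect is shown. Note that in this column, we write

$$\beta_{30}^* = \beta_{30} + \beta_{32}\beta_{20},$$

$$\gamma^* = \beta_{32}X'\gamma_2 + X'\gamma_3, \text{ and}$$

$$\beta_{31}^* = \beta_{31} + \beta_{32}\beta_{21}.$$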

Also, in the 'Effect Estimate' column, the quantities $R_1$ and $R_2$ are calculated
by

$$R_1 = e^{\hat\gamma^*}$$

and

$$R_2 = e^{X'\hat\gamma_3}.$$

The $R_1$ value is calculated from the estimate of $\gamma^*$ in the 'Effect' column of Table
8.3 and, likewise, the $R_2$ value is calculated from the estimate of $X'\gamma_3$. These
quantities result from estimation of $\gamma^*$ and $\gamma_3$ and, thus, the associated parameter
estimates given in Table 8.2, and are based on the appropriate category of the
exogenous variable RACE. Hence, we have that

$R_1 = 1.2626$ and $R_2 = 1.0321$ if RACE = black,

$R_1 = 0.8457$ and $R_2 = 1.0827$ if RACE = hispanic,

$R_1 = 0.9745$ and $R_2 = 1.1293$ if RACE = other, and

$R_1 = 1$ and $R_2 = 1$ if RACE = white.

The effect estimates listed in Table 8.3 can be used along with the
COE of Theorem 5.5.1 to calculate the total effect estimates $TE_{41}$, $TE_{42}$, $TE_{43}$,
$TE_{31}$, $TE_{32}$ and $TE_{21}$. For example, two specific quantities of interest are

$$\begin{aligned}
TE_{41} &= E(DE_{41}) + E(IE_{421}) + E(IE_{431}) + E(IE_{4321}) \\
&= 0.1161 + 0.0012 + (-0.0506 e^{0.0173 y_1} - 0.0054 e^{0.0173 y_1}) \cdot R_1 \\
&= 0.1173 - 0.0560 e^{0.0173 y_1} \cdot R_1
\end{aligned}$$

and

$$\begin{aligned}
TE_{31} &= E(DE_{31}) + E(IE_{321}) \\
&= (0.3180 e^{0.0173 y_1} + 0.0338 e^{0.0173 y_1}) \cdot R_1 \\
&= 0.3518 e^{0.0173 y_1} \cdot R_1.
\end{aligned}$$
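
Because both total effects depend on $y_1$ (MAGE) and on RACE through $R_1$, it can help to evaluate them at specific values. The following Python sketch does so using the rounded coefficients above and the $R_1$ values listed for each RACE category; the function names and the value of MAGE used in the loop are illustrative assumptions, not part of the original analysis.

```python
import math

# R1 multipliers by RACE category, as listed in the text.
R1 = {"black": 1.2626, "hispanic": 0.8457, "other": 0.9745, "white": 1.0}


def te41(mage, race):
    """Estimated total effect of MAGE on MDI at a given value of MAGE."""
    return 0.1173 - 0.0560 * math.exp(0.0173 * mage) * R1[race]


def te31(mage, race):
    """Estimated total effect of MAGE on LOS at a given value of MAGE."""
    return 0.3518 * math.exp(0.0173 * mage) * R1[race]


if __name__ == "__main__":
    for race in R1:
        # 25.0 is an arbitrary illustrative value of MAGE.
        print(race, round(te41(25.0, race), 4), round(te31(25.0, race), 4))
```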

8.1.6    Testing  of  Effects 

Recall that this example makes the assumptions of independent and normally
distributed $\epsilon_k$, thereby allowing the theory presented in Sections 5.5 and 5.5.4 to be
implemented. Utilizing the notation $\eta_k = Y'_{A_k}\beta_k$, $k = 1, 2, 3, 4$, to represent the
$k$th linear predictor, suppose we wish to test the significance of the DE of MAGE
on LOS in the applied example presented throughout this section. That is, suppose
we wish to test the significance of the quantity

$$DE_{31} = \beta_{31} E\left[\frac{\partial f_3(\eta_3)}{\partial \eta_3}\right],$$

where $\eta_3 = Y'_{A_3}\beta_3$. From Theorem 5.5.2, testing the hypothesis

$$H_0: DE_{31} = 0$$
$$H_1: DE_{31} \neq 0$$

is equivalent to testing the hypothesis

$$H_0: \beta_{31} = 0.$$

Using  the  parameter  estimates  and  their  estimated  standard  errors  from  Table 
8.1,  the  test  statistic  given  by  Theorem  5.5.2  is 


$$Z = \frac{\hat\beta_{31}}{\sqrt{\widehat{Var}(\hat\beta_{31})}} = \frac{0.0156}{0.0027} = 5.7778.$$

Thus, the p-value of the hypothesis test is $p = 0.0001$ and we conclude that, at
$\alpha = 0.05$, there is a significant direct effect of MAGE on LOS.

Likewise, consider testing the significance of $IE_{321}$. To test this effect, we must
perform a hypothesis test of

$$H_0: IE_{321} = 0$$
$$H_1: IE_{321} \neq 0$$

which, by Theorem 5.5.2, is equivalent to testing

$$H_0: \beta_{32}\beta_{21} = 0$$
$$H_1: \beta_{32}\beta_{21} \neq 0.$$

The test statistic from Theorem 5.5.2 is

$$Z = \frac{\hat\beta_{32}\hat\beta_{21}}{\sqrt{Var(\hat\beta_{32}\hat\beta_{21})}}.$$

From the parameter estimates and their associated estimated standard errors given
by Table 8.1, we have the following:

$$\hat\beta_{32} = -0.0005, \quad Var(\hat\beta_{32}) = 0$$

and

$$\hat\beta_{21} = -3.3151, \quad Var(\hat\beta_{21}) = 3.2075^2.$$

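To calculate the denominator of the test statistic, we recall Equation 5.49, that is,
we write

$$Var\left(\prod_{(k',j') \in I(Q)_k} \hat\beta_{k'j'}\right) = \prod_{(k',j') \in I(Q)_k} \left[Var(\hat\beta_{k'j'}) + \beta_{k'j'}^2\right] - \prod_{(k',j') \in I(Q)_k} \beta_{k'j'}^2.$$

Hence, we have that

$$\begin{aligned}
Var(\hat\beta_{32}\hat\beta_{21}) &= [0 + (-0.0005)^2][3.2075^2 + (-3.3151)^2] - [(-0.0005)^2(-3.3151)^2] \\
&= 2.572014 \times 10^{-6}.
\end{aligned}$$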

Therefore, the value of the test statistic is

$$Z = \frac{(-0.0005)(-3.3151)}{0.0016} = \frac{0.0017}{0.0016} = 1.06.$$

Hence, $p = 0.2892$ and, at $\alpha = 0.05$, we fail to reject the null hypothesis of
$H_0: IE_{321} = 0$.
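
The two tests above follow the same recipe: form the estimate, obtain its variance (directly for a DE, or from the product-variance expression for an IE), and refer the ratio to the standard Normal distribution. The Python sketch below illustrates that recipe with the rounded values from Table 8.1. The function names are hypothetical, and the product-variance function assumes independent estimates, as in the form of Equation 5.49 quoted above. With these rounded inputs the variance reproduces $2.572014 \times 10^{-6}$, while the unrounded ratio for $IE_{321}$ evaluates to about 1.03; the value 1.06 in the text arises when numerator and denominator are rounded to four decimals before dividing.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard Normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def product_variance(estimates, variances):
    """Variance of a product of independent estimates.

    Computed as prod(var_i + b_i^2) - prod(b_i^2).
    """
    with_var = math.prod(v + b * b for b, v in zip(estimates, variances))
    squares = math.prod(b * b for b in estimates)
    return with_var - squares


def z_for_product(estimates, variances):
    """Test statistic for H0: the product of the coefficients is zero."""
    return math.prod(estimates) / math.sqrt(product_variance(estimates, variances))


if __name__ == "__main__":
    # Direct effect of MAGE on LOS: Z = estimate / SE.
    z_de31 = 0.0156 / 0.0027
    print(z_de31, two_sided_p(z_de31))

    # Indirect effect of MAGE on LOS through BW.
    b = [-0.0005, -3.3151]
    v = [0.0, 3.2075 ** 2]
    z_ie321 = z_for_product(b, v)
    print(product_variance(b, v), z_ie321, two_sided_p(z_ie321))
```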

8.1.7    Substantive  Interpretations  of  Effects 

There  are  several  main  points  that  arise  when  interpreting  the  effects  studied 
in  this  particular  example.  First,  there  is  no  significant  direct  effect  of  MAGE 
on  MDI  (p  =  0.0721).  More  specifically,  if  the  medical  and  biological  effects  of 
increasing  MAGE  are  expected  to  be  attained  through  BW  and  LOS,  then  the 
DE  of  MAGE  on  MDI  would  represent  the  environmental/social  aspects,  perhaps, 
such  as  higher  parity  and,  hence,  diversion  of  attention,  or  less  nurturing  as  the 
mother  grows  older.  This  lack  of  a  significant  DE  suggests  that  the  entire  effect 
of  MAGE  on  MDI  is  indirect  through  the  biological  and  medical  variables,  as 
reflected  by  BW  and  LOS.  Thus,  there  is  little  evidence  for  an  environmental/social 
effect  of  mother's  age  on  the  child's  mental  development  within  the  RPICC 
population  of  premature  and/or  sick  infants.  Other  effects  in  Table  8.3  were 
similarly  tested.  The  results  suggested  that  the  only  significant  effect  of  MAGE  on 
MDI  occurred  indirectly  through  LOS  (p  =  0.0000),  which  is  perhaps  indicative 
of  other  complications  at  birth,  such  as  congenital  anomalies  due  to  increasing 
mother's  age. 

Also,  there  is  not  a  significant  DE  of  BW  on  MDI  (p  =  0.3982),  which  is 
indicative  of  the  fact  that  BW  alone  does  not  affect  MDI.  However,  there  is  a 
significant IE of BW on MDI through LOS (p = 0.0000), thereby indicating that
the effects of BW on MDI are due only to those effects associated with low birth
weight that cause extended stay at the hospital due to associated conditions and
their treatment at birth, such as underdeveloped organs (for example, lungs), birth
defects, and the need for ventilation.

It  should  also  be  noted  that  there  is  not  a  significant  DE  of  MAGE  on  BW 
(p  =  0.3014)  within  the  RPICC  population.  This  fact  indicates  that  there  is  not 
a  trend  among  older  mothers  to  deliver  lower  birth  weight  infants  than  younger 
mothers.  This  result  is  contradictory  to  studies  and  results  presented  in  most  of 
the  literature  where  the  general  population  is  under  study  (Ventura  et  al.  [68,  69], 
Nolan  and  Magee  [52],  Johnson  et  al.  [33]).  There  is,  however,  a  significant  DE 
of  MAGE  on  LOS  (p  =  0.0001).  This  significant  DE  indicates  that,  within  the 
population  of  RPICC  infants,  there  is  an  increase  in  LOS  as  MAGE  increases, 
when  controlling  for  BW.  This  is  probably  a  reflection  of  the  higher  incidence  of 
congenital  anomalies  such  as  Down's  Syndrome,  among  infants  of  older  mothers. 

In conclusion, the only significant average IEs in the causal model were from:
(1) BW on MDI through LOS, that is, $IE_{432}$; and (2) MAGE on MDI through
LOS, that is, $IE_{431}$. More importantly, we find that LOS is the only variable that
exerts a DE onto MDI. Hence, to improve MDI scores among children, one must
primarily focus on the associated causes of LOS.

8.2     Proposed Application of the Dichotomous Variables Model to Alzheimer's Disease

Alzheimer's Disease (AD) is the primary cause of dementia in the elderly and
affects approximately fifteen million people worldwide (Honig and Mayeux [32]).
Four million Americans are afflicted with AD. One in ten persons over sixty-five
has the disease. The average lifetime cost per Alzheimer's patient in the U.S.A.
is $174,000. AD costs the U.S.A. at least $100 billion per year. Neither Medicare
nor most private health care policies cover the long-term care that most patients
require (Alzheimer's Association [2]).

AD  is  a  neurodegenerative  disorder  characterized  by  accumulation  of  forms 
of  the  neurotoxic  40-  and  42-amino  acid  A  beta  peptides  (A  beta  40  and  A  beta 
42,  respectively)  (Petanceska  et  al.  [54]).  The  Amyloid  precursor  protein  (APP) 
gene  is  the  source  of  these  A  beta  peptides  (Koldamova  et  al.  [40]).  The  hormone 
estrogen  has  several  properties  that  are  thought  to  regulate  the  ill  effects  of  the  A 
beta  peptides,  and,  thus,  reduce  the  risk  of  AD.  Estrogen  exerts  a  neuroprotective 
factor  via  17  beta-estradiol  (E2)  (Behl  and  Manthey  [4],  Petanceska  et  al.  [54]). 
Estrogen  regulates  beta  APP  metabolism  and  the  repair  of  neuron  receptors  and, 
also,  improves  blood  flow  in  regions  of  the  brain  affected  by  AD  (Gouras  et  al.  [21], 
Greene  [27]). 

Postmenopausal  women  make  up  approximately  ten  percent  of  the  world 
population  (Ringa  [56]).  Also,  western  women  spend  one-third  of  their  life  in  a 
postmenopausal  state  and,  hence,  in  estrogen  deficiency  (Palacios  [53]).  It  is  this 
estrogen  shortage  that  is  believed,  in  part,  to  account  for  the  fact  that  women  are 
three  times  more  likely  than  men  to  develop  AD  (Greene  [27]).  Thus,  replacing 
and  supplementing  the  naturally  occurring,  endogenous  estrogens  with  hormone 
replacement  therapy  (HRT)  has  drawn  increased  attention,  both  in  the  medical 
field  and  in  the  media  (Cowley  [9],  Greene  [27],  Lambert  [41],  Slooter  [60]). 

There  are  also  other  factors,  besides  estrogen,  that  are  believed  to  contribute 
to  increased  or  decreased  risk  of  AD  among  women.  Some  of  these  factors  are 
age  at  menopause  (decreased  risk  of  AD  with  increasing  age  at  menopause), 
type  of  menopause  (decreased  risk  of  AD  with  natural  menopause)  (Geerlings 
et  al.  [18],  Petanceska  et  al.  [54]),  vasomotor  symptoms,  that  is,  "hot  flushes", 
(decreased  risk  of  AD  with  decreased  occurrence  of  "hot  flushes")  (Pritchard  [55]), 
use of oral contraceptives (decreased risk of AD with use of oral contraceptives),
education level (decreased risk of AD with increased education level), smoking
habits  (decreased  risk  of  AD  with  nicotine  use)  and  race  (decreased  risk  of  AD  with 
race  of  white)  (Harlow  and  Signorello  [29],  Letenneur  et  al.  [43],  Tang  et  al.  [64]). 
Another  factor  believed  to  be  crucial  to  the  onset  of  AD  is  related  to  the  gene 
Apolipoprotein  E  (apoE),  which  has  three  major  isoforms.  They  are  apoE2,  apoE3 
and  apoE4  (Malley  and  Rail  [45]).  The  risk  of  AD  is  more  pronounced  among 
apoE4  carriers  (Tol  et  al.  [66],  Uchida  et  al.  [67],  Zuliani  et  al.  [77]). 

There  is  an  abundance  of  studies  relating  AD  to  estrogen  and/or  HRT. 
However,  none  of  these  studies  viewed  the  relationships  among  variables  as  a 
causal path model. Hence, none of the studies attempts to view the effects of risk
factors as combinations of direct and/or indirect effects on the outcome of AD
and, thereby, to discover whether, for example, the effect of oral contraceptives is
solely an indirect effect through other variables later in the causal chain, or whether
there is also a direct effect.

Using  the  methodology  developed  in  this  dissertation,  the  chain  of  variables 
affecting  AD  could  be  viewed  as  a  path  model  and  analyzed  as  such.  One  proposed 
analysis  is  to  view  the  chain  as  follows,  where,  as  presented  in  earlier  chapters, 
the  X  variables  represent  exogenous  variables  and  the  Y  variables  represent  the 
endogenous  variables  in  the  causal  chain: 

$X_1$ = race (RACE),
$X_2$ = education level (EDL),
$Y_1$ = apoE4 marker (E4),
$Y_2$ = use of oral contraceptives (OC),
$Y_3$ = type of menopause (TM),
$Y_4$ = hormone replacement therapy (HRT),
$Y_5$ = Alzheimer's Disease (AD).

Note that the first endogenous variable in the causal chain, that is, $Y_1$, is the
apoE4 marker, which is a genetic marker obtained (or not) at conception and,
hence, is placed first in the causal ordering. From the methodology presented in
Chapter 6, the causal chain given above can be analyzed as a dichotomous path
model where each of the endogenous variables takes on one of two possible values,
such as "yes" (=1) or "no" (=0). An analysis such as this could be beneficial to
the ongoing research relating the apoE4 marker with HRT and AD. With this
path model analysis, the severity of the DE of E4 on the outcome of AD could be
analyzed, as well as the IE of E4 on AD through such intervening variables as TM
and HRT. Thus, it could be determined if the effect of E4 on AD is primarily a
direct effect or is partly an indirect effect(s) through the intermediate variables.

8.3    Proposed Application of the Dichotomous-Continuous-Dichotomous Case to Surrogate Variables

We now discuss the results presented in Section 7.1 as applied to problems
involving surrogate variables, as was previously discussed in Section 1. We consider
first a basic three variable path model where

$Y_1$ = treatment factor or risk factor of interest,
$Y_2$ = potential surrogate variable, and
$Y_3$ = outcome variable of interest.

More specifically, we consider the setting previously presented in Section 7.1.2
where

$Y_1$ = hypertensive drug treatment,
$Y_2$ = blood pressure level, and
$Y_3$ = occurrence of stroke.

To study problems such as these, we make use of the following definition
(Carter and Johnson [8]).

Definition 8.3.1. When studying the effect of $Y_1$ on $Y_3$, $Y_2$ is a perfect surrogate
for $Y_3$ if

$f_3(Y_1, Y_2) = f_3(Y_2)$, and

$Var(\epsilon_3) = 0$.

These conditions mean, respectively, that

1. $Y_1$ only influences $E(Y_3 \mid y_1, y_2)$ indirectly through its influence on $Y_2$, that is,
there is no DE of $Y_1$ on $Y_3$. Hence,

2. $Y_3$ is totally determined by $Y_2$ and the form of $f_3$.

The magnitude of deviation from condition (2) can be measured using an $R^2$ type
measurement. This measurement is written as (Carter and Johnson [8])

$$C_2 = \left\{1 - \frac{\sum [y_3 - f_3(y_2)]^2}{\sum (y_3 - \bar{y}_3)^2}\right\} \times 100\%.$$

Values of this measure that are close to 100% suggest that the variance of $\epsilon_3$ is
approximately zero.

The magnitude of deviation from condition (1) above in the definition of a
perfect surrogate can be measured by

$$C_1 = \frac{IE}{TE} \times 100\%,$$

where the definitions of IE and TE will vary depending on the special case under
consideration, based on the cases considered throughout this paper and their
associated methodology. More explicitly, IE and TE as mentioned above may be
defined as derivatives, difference quotients or ratios, as proposed in this dissertation
and deemed necessary by the variables and model under consideration. Again, also
note that values of $C_1$ close to 100% suggest that $Y_1$ influences $Y_3$ only indirectly
through $Y_2$.

Recall from Sections 7.1.1 and 7.1.2 that, for the particular special case where
$Y_1$ = treatment with hypertensive drug is represented by either "yes" (=1) or
"no" (=0), that is, $Y_1$ is a dichotomous variable, $Y_2$ is continuous and $Y_3$ is also
dichotomous, we can write

$$TE_{31} = DE_{31} \cdot IE_{321}.$$

Therefore, for the specified functional forms given in Section 7.1.2 for $Y_1$, $Y_2$, and
$Y_3$, we have that

$$C_1 = \frac{1}{DE_{31}} \times 100\%.$$

Hence, we can measure the strength of the proposed surrogate variable by
calculating a DE. If the above measurement of $C_1$ is close to 100%, then $f_3(Y_2)$
[$= f_3(Y_1, Y_2)$] can be estimated in a separate study, ignoring $Y_1$. Thus, we can
use $f_3(Y_2)$ in place of $f_3(Y_1, Y_2)$, together with $f_2(Y_1)$ from the current study. To
determine if $Y_2$ is an acceptable surrogate variable for $Y_3$, both $C_1$ and $C_2$ together
should be considered. If it is deemed that $Y_2$ is a good surrogate, we may be able
to study the effects on the outcome variable of interest in a shorter period of time,
thereby reducing the time to market for effective drugs.

Note also that $Y_1$ may be measured as a continuous variable, that is, the
exact dosage of the hypertension drug of interest may be available. Hence, the
methodology given in Section 7.2 would be implemented.
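
The two surrogate measures can be computed in a few lines once estimates of the effects and fitted values are available. The Python sketch below is a hedged illustration only: the function names are hypothetical, $C_1$ is computed both from a generic IE and TE and from the multiplicative special case $TE_{31} = DE_{31} \cdot IE_{321}$, and $C_2$ is assumed to take the usual $R^2$ form of one minus the ratio of the residual to the total sum of squares.

```python
def c1_percent(indirect_effect, total_effect):
    """C1 = (IE / TE) x 100%."""
    return 100.0 * indirect_effect / total_effect


def c1_from_direct_effect(direct_effect):
    """C1 when TE = DE * IE, so that IE / TE = 1 / DE."""
    return 100.0 / direct_effect


def c2_percent(y3, fitted):
    """R^2-type measure, assumed here to be {1 - SSE / SST} x 100%.

    y3     : observed outcome values
    fitted : fitted values from the mean function based on Y2 alone
    """
    mean_y3 = sum(y3) / len(y3)
    sse = sum((y - f) ** 2 for y, f in zip(y3, fitted))
    sst = sum((y - mean_y3) ** 2 for y in y3)
    return 100.0 * (1.0 - sse / sst)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the functions.
    print(c1_percent(0.9, 1.0))
    print(c1_from_direct_effect(1.25))
    print(c2_percent([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
```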


CHAPTER  9 
FUTURE  WORK  AND  CONCLUSIONS 

9.1     Inferential  Procedures  for  Models  Containing  Dichotomous  Variables 
For models with dichotomous variables, a COE partitioning of $TE_{lk|A_k}$ was
developed in Chapter 6. Due to complications that arise
in estimation of the average direct and indirect effects in such a model, estimation
and testing procedures will be developed in future work. Two approaches to such
estimation and testing procedures will be investigated.

When each $f_l$ mean function is modeled as a function of a linear predictor, $\eta_l$,
traditional estimation methods can be implemented. However, testing products of
the coefficients in the linear predictor becomes difficult in this situation, as opposed
to models containing only continuous random variables, due to the lack of a chain
rule for dichotomous variables. More specifically, if $Y_k$ is continuous, then the
MVCR can be applied to obtain

$$\frac{\partial f_l}{\partial y_k} = \frac{\partial f_l}{\partial \eta_l}\frac{\partial \eta_l}{\partial y_k}.$$

There is no analog of the chain rule in the Calculus of Finite Differences to allow
such a decomposition of the quantity $\Delta_{y_k} f_l(\eta_l)$. One attempt at alleviating this
difficulty will be to consider writing

realizing  that  this  is  not  theoretically  or  mathematically  accurate,  but  is  a  heuristic 
approach.  Future  work  will  be  to  theoretically  justify  this  method,  and,  thus,  apply 
Theorem  5.5.2  for  hypothesis  testing. 
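
The lack of a chain rule for finite differences can be seen numerically. The Python sketch below is a generic illustration and is not the specific heuristic proposed above: it assumes a logistic mean function and hypothetical coefficients, and compares the exact change in the mean when a dichotomous variable moves from 0 to 1 with the derivative-style product $f'(\eta) \cdot \partial\eta/\partial y$ evaluated at $y = 0$.

```python
import math


def logistic(eta):
    """Logistic mean function, used here only as an illustrative f."""
    return 1.0 / (1.0 + math.exp(-eta))


def logistic_derivative(eta):
    """Derivative of the logistic function with respect to eta."""
    p = logistic(eta)
    return p * (1.0 - p)


def difference_quotient(b0, b1):
    """Exact change in the mean when a dichotomous y moves from 0 to 1."""
    return logistic(b0 + b1) - logistic(b0)


def derivative_product(b0, b1, y=0):
    """Derivative-style product f'(eta) * d(eta)/dy, evaluated at y."""
    return logistic_derivative(b0 + b1 * y) * b1


if __name__ == "__main__":
    b0, b1 = -1.0, 2.0  # hypothetical coefficients
    print(difference_quotient(b0, b1), derivative_product(b0, b1))
```

The two quantities agree only approximately, and the gap grows with the size of the coefficient on the dichotomous variable, which is why a product of derivative-type terms cannot simply be substituted for a difference quotient without further justification.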

Also, nonparametric estimation of the $f_l$ mean functions will be explored using
sample proportions. Estimation of difference quotients of these sample proportions
is not difficult; however, testing the significance of IEs requires averaging
products of these difference quotients, which is quite complicated due to the
different sub-populations that will be encountered as the number of intermediate
variables within a causal chain grows. That is, one must know the distribution of
$Y_I \mid Y_{A_k}, Y_k$ for every possible set of conditioning variables. Hypothesis testing also
will require development of the asymptotic distribution of the estimated average of
products of the difference quotients involved in each IE, as well as the estimated
standard error.

9.2     Nonlinear Models with p Continuous Variables and Nonindependent Error Terms

By  viewing  the  total  effect  of  one  endogenous  variable  upon  another  variable, 
within  a  linear  or  nonlinear  path  model  with  continuous  variables  and  classical 
assumptions,  as  the  derivative  of  the  conditional  expectation  (after  eliminating 
intermediate  variables),  a  TE  can  be  decomposed  into  a  sum  of  the  average  DE 
and  average  IEs  through  the  intermediate  variables,  which  can  then  be  written  as 
conditional  expectations  of  products  of  derivatives.  These  conditional  expectations 
of  products  are  analogous  to  the  IEs  in  the  COC.  Both  this  COE  partitioning  and 
the  COC  partitioning  for  linear  models  result  from  an  application  of  the  MVCR. 

In  this  dissertation,  we  have  derived  the  COE  and  presented  estimation 
techniques  and  testing  of  the  estimators  of  the  expected  direct  and  indirect  effects 
included  in  the  COE.  Future  work  will  focus  on  developing  confidence  intervals  for 
the  expected  direct  and  indirect  effects  in  the  COE. 

Also,  future  work  proposed  here  will  be  to  obtain  additional  analogs  of  the 
COC  to  causal  chains  involving  more  general  models.  The  current  work  presents 
a conceptualization of classical path analysis, and its well-known COC, that
generalizes to at least some nonclassical models. Future research will consist
of  studying  more  general  nonclassical  models,  with  special  emphasis  on  the 
exponential  family  of  distributions,  with  both  continuous  and  discrete  endogenous 
variables.  Classes  of  distributions  need  to  be  studied  to  determine  more  general 
cases  when  the  second  integral  term  in  Equation  5.33  becomes  zero.  When  this 
term  does  not  equal  zero,  special  classes  of  distributions  will  be  investigated  to 
analyze  this  second  term. 

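More specifically, consider again the model involving four endogenous variables,
as given previously in Equations 5.6-5.9, with no assumptions regarding
independence of the error terms, $\epsilon_k$. Recall that we can write

$$\begin{aligned}
TE_{41} &= \frac{\partial E(Y_4 \mid y_1)}{\partial y_1} \\
&= \int_{\epsilon_2}\int_{\epsilon_3} \frac{\partial f_4}{\partial y_1} \cdot g_{\epsilon_2,\epsilon_3|Y_1}(\epsilon_2, \epsilon_3 \mid y_1)\, d\epsilon_2\, d\epsilon_3 \\
&\quad + \int_{\epsilon_2}\int_{\epsilon_3} f_4 \cdot \frac{\partial g_{\epsilon_2,\epsilon_3|Y_1}(\epsilon_2, \epsilon_3 \mid y_1)}{\partial y_1}\, d\epsilon_2\, d\epsilon_3
\end{aligned} \tag{9.2}$$

where

$$g_{\epsilon_2,\epsilon_3|Y_1}(\epsilon_2, \epsilon_3 \mid y_1) = g_{\epsilon_3|\epsilon_2,Y_1}(\epsilon_3 \mid \epsilon_2, y_1) \cdot g_{\epsilon_2|Y_1}(\epsilon_2 \mid y_1)$$

and the distributions of $\epsilon_2$ and $\epsilon_3$, conditional on any previous $\epsilon_k$ quantities and
$Y_1 = y_1$, are from the centered regular exponential family (McCullagh and Nelder
[47]). That is, the distribution of $\epsilon_2$ conditional on $Y_1 = y_1$ can be written as

$$g_{\epsilon_2|Y_1}(\epsilon_2 \mid y_1) = \exp\{[(\epsilon_2 + b_2'(\theta_2))\theta_2 - b_2(\theta_2)]/a_2(\phi_2) + c_2(\epsilon_2 + b_2'(\theta_2), \phi_2)\} \tag{9.3}$$

where

$$b_2'(\theta_2) = E(Y_2 \mid Y_1 = y_1) = \mu_2,$$

$$b_2''(\theta_2)\,a_2(\phi_2) = V(Y_2 \mid Y_1 = y_1) = V_2$$

and $b_2(\theta_2)$ is a function of $Y_1 = y_1$. The density function $g_{\epsilon_3|\epsilon_2,Y_1}(\epsilon_3 \mid \epsilon_2, y_1)$ can be
written in a similar fashion.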

The TE expansion given above in Equation 9.2 can be investigated further
using the density given in Equation 9.3 to determine the form of the TE under
nonindependence  of  the  error  terms  within  the  exponential  family.  Also,  other 
discrete  variable  models  within  the  exponential  family  can  be  studied  using  the 
Calculus  of  Finite  Differences,  as  was  shown  in  Chapter  6  for  the  dichotomous 
variables  case. 

9.3     Generalized  Discrete  and  Mixed  Models 
Again, considering a four variable model and writing the integrals as Lebesgue-
Stieltjes integrals (Billingsley [6]), which are applicable to both continuous and
discrete variables, the Calculus of Finite Differences can be applied to write

$$E(Y_l \mid Y_{A_k} = y_{A_k}, Y_k = y_k) = E_{Y_I}[f_l(y_{A_k}, y_k, Y_I)].$$

That is, we can express the multiple Lebesgue-Stieltjes integrals involved in the
above expectation as a combination of $m = l - (k + 1)$ Riemann integrals or sums,
depending on the continuity or noncontinuity, respectively, of the $m$ intermediate
variables.

We will investigate the possibility of formulating a general analog of derivatives
for these causal chains involving both continuous and discrete variables such that
the total effect equals the derivative of a conditional expectation. We will attempt
to show that the conditional TE equals the Lebesgue-Stieltjes integral (Billingsley
[6]) of the general analog of a derivative, denoted $\Delta^*$, integrated with respect to
the intermediate variables between $Y_k$ and $Y_l$ and conditioned on all variables
antecedent to the specific variable of interest. The notation $\Delta^*$ is used to denote
an actual derivative if the $Y_k$ variable of interest is continuous, but denotes a
difference quotient if the $Y_k$ variable is discrete.

Explicitly, using the notation from previous chapters, for any particular $Y_l$ we
can determine the TE of $Y_k$ on $Y_l$, for any $k$ and $l$ such that $1 \le k < l \le p$, by

$$\Delta^*_{y_k} E(Y_l \mid y_{A_k}, y_k) = \Delta^*_{y_k}\left\{E_{Y_I \mid y_{A_k}, y_k}\left[f_l(y_{A_k}, y_k, Y_I) + \epsilon_l\right]\right\} \tag{9.4}$$

where, again, $Y_{A_k}$ collectively represents all variables antecedent to $Y_k$ in the
causal chain and $Y_I$ collectively represents all variables intermediate to $Y_k$ and
$Y_l$ in the chain. The $\Delta^*$ notation is being used here to represent a general analog
of a derivative, depending on whether the variables included in $Y_I$ are discrete
or continuous. More explicitly, if a particular $Y_I$ variable, say $Y_{I_1}$, is continuous,
then the expectation for $Y_{I_1}$ must be computed by the usual Riemann integration.
However, if another intermediate variable, say $Y_{I_2}$, is discrete, then the
expectation for $Y_{I_2}$ should be computed via summation. Thus, the expectations
written above will be combinations of integrals and summations. Also, note
once again that, in situations such as these, where we must account for both the
antecedent and intermediate variables, the methodology will be to condition on the
antecedent variables, $Y_{A_k}$, and to average over the intermediate variables, $Y_I$.
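
The combination of summation and integration described here can be sketched computationally. The following Python example is a simplified illustration under assumptions that go beyond the text: it averages a generic function over one dichotomous and one continuous intermediate variable, takes the two to be independent with a Bernoulli and a Normal distribution respectively, and uses the trapezoidal rule for the integral. In the setting above the distributions would instead be conditional on the antecedent variables. All names are hypothetical.

```python
import math


def normal_pdf(y, mu, sigma):
    """Density of a Normal(mu, sigma^2) variable."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def average_over_intermediates(f, p_one, mu, sigma, steps=4000):
    """Average f(d, c) over a Bernoulli d and an independent Normal c.

    The discrete variable is handled by summation over {0, 1}; the
    continuous one by the trapezoidal rule on mu +/- 8 sigma.
    """
    lower, upper = mu - 8.0 * sigma, mu + 8.0 * sigma
    width = (upper - lower) / steps
    total = 0.0
    for d, prob in ((0, 1.0 - p_one), (1, p_one)):  # summation
        integral = 0.0
        for i in range(steps + 1):  # numerical integration
            c = lower + i * width
            weight = 0.5 if i in (0, steps) else 1.0
            integral += weight * f(d, c) * normal_pdf(c, mu, sigma)
        total += prob * integral * width
    return total


if __name__ == "__main__":
    # Hypothetical mean function that is linear in both intermediates.
    print(average_over_intermediates(lambda d, c: 2.0 * d + 3.0 * c, 0.3, 1.5, 2.0))
```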

Equations  such  as  9.4  above  will  be  fully  investigated  to  determine  if  an  analog 
to  the  COC  exists  in  such  cases.  Also,  we  will  examine  cases  where  the  probability 
mass function takes on a specified form, as was considered in Chapter 6 in
the case of models containing only dichotomous variables. That is, we will study
specific cases to determine if the COE partitioning holds if the probability mass
function takes on other designated forms.